Given any regularly varying dislocation measure, we identify a natural
self-similar fragmentation tree as scaling limit of discrete fragmentation
trees with unit edge lengths. As an application, we obtain continuum random
tree limits of Aldous's beta-splitting models and Ford's alpha models for
phylogenetic trees. This confirms in a strong way that the whole trees grow
at the same speed as the mean height of a randomly chosen leaf.

1. Introduction. For a number of years, there has been an increased
interest in random tree models, both in the mathematical literature and
in applied sciences such as genetics. Fundamental classes of trees are trees
with $n$ leaves and no degree-2 nodes. Denote by $\mathbb{T}^\circ_n$ the space
of such trees, which can be made mathematically precise as a space of
connected acyclic graphs with $n + 1$ degree-1 vertices, one of which is
distinguished as the root. Also, denote by $\mathbb{T}_n$ the space of such
trees where the other $n$ degree-1 vertices (the leaves) are labeled
$1, \ldots, n$. Such trees are called cladograms in the genetics literature,
up to trivial differences and an extension. Here, the trees are planted,
that is, the root has only one neighbor, and they are not necessarily binary,
as is usually assumed in the phylogenetics literature. The only edge adjacent
to the root is called the root-edge.

A class of probability distributions on $\mathbb{T}^\circ_n$ or $\mathbb{T}_n$
can be specified by a procedure called Markov branching [1]: $P^\circ_1$ is the
unique distribution on the singleton $\mathbb{T}^\circ_1$. Recursively,
$P^\circ_n$ is the distribution of a random tree $T^\circ_n$, where the unique
branch point neighboring the root connects $r \ge 2$ subtrees with
$k_1 \ge \cdots \ge k_r \ge 1$ leaves, respectively, $k_1 + \cdots + k_r = n$,
with some probability $q_n(k_1, \ldots, k_r)$ (so that $q_n$ is a probability
distribution on the set of partitions of the integer $n$). Given the branching
into sizes $k_1, \ldots, k_r$, these subtrees are independent, with
distributions $P^\circ_{k_i}$. Finally, define $P_n$ as the distribution of the
random tree $T^\circ_n$, equipped with leaf labels uniform among all possible
labelings with $\{1, \ldots, n\}$. Therefore, all Markov branching models on
$\mathbb{T}_n$, thus defined, have exchangeable leaf labels.
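
The recursive Markov branching procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested-tuple encoding of trees, the function names and the splitting rule `uniform_binary_split` are hypothetical choices made for the example and are not taken from the paper.

```python
import random

def sample_shape(n, split, rng):
    """Sample an unlabeled Markov branching tree with n leaves.

    A leaf is the empty tuple (); a branch point is the tuple of its subtrees.
    split(n, rng) must return a partition of n into r >= 2 positive parts,
    i.e. one draw from the splitting rule q_n.
    """
    if n == 1:
        return ()
    sizes = split(n, rng)
    assert len(sizes) >= 2 and sum(sizes) == n and min(sizes) >= 1
    # given the block sizes, the subtrees are sampled independently
    return tuple(sample_shape(k, split, rng) for k in sizes)

def uniform_binary_split(n, rng):
    """Illustrative rule (an assumption): k uniform on 1..n-1, sizes (n-k, k)."""
    k = rng.randint(1, n - 1)
    return (max(k, n - k), min(k, n - k))

def attach_labels(shape, labels):
    """Replace the leaves of a shape by labels, consumed in depth-first order."""
    if shape == ():
        return labels.pop()
    return tuple(attach_labels(sub, labels) for sub in shape)

def sample_labeled(n, split, rng):
    """Sample a shape, then attach a uniform random labeling by 1..n."""
    labels = list(range(1, n + 1))
    rng.shuffle(labels)
    return attach_labels(sample_shape(n, split, rng), labels)

def leaves(tree):
    """List the leaves of a shape (as ()) or of a labeled tree (as ints)."""
    if tree == () or isinstance(tree, int):
        return [tree]
    return [leaf for sub in tree for leaf in leaves(sub)]
```

Any other splitting rule can be plugged in through the `split` argument, as long as it returns a partition of $n$ into at least two parts.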

Important models in phylogenetics such as the Yule, uniform and comb
models, and, more generally, Aldous's beta-splitting models [1, 5] and Ford's
alpha models [21] have the Markov branching property (see [1, 21] for
references to the phylogenetics literature). They also have a property of
sampling consistency, that is, the subtree of $T_n \sim P_n$ generated by the
leaves labeled $1, \ldots, n-1$ has distribution $P_{n-1}$.

By a standard argument using Kolmogorov's extension theorem, for sampling
consistent $(P_n)_{n \ge 1}$, one can consider a strongly sampling consistent
sequence $(T^\circ_n)_{n \ge 1}$ [resp. $(T_n)_{n \ge 1}$] defined on some
probability space, in the (stronger) sense that $T^\circ_{n-1}$ is the subtree
of $T^\circ_n \sim P^\circ_n$ generated by $n - 1$ leaves chosen uniformly at
random (resp. leaves $1, \ldots, n-1$) for all $n \ge 2$.

The recursive definition of $P_n$ is due to Aldous [1] in the binary case,
where $q_n$ is supported by partitions of $n$ of the form $(n-k, k)$,
$1 \le k \le n/2$, for all $n$. Not all Markov branching models are sampling
consistent [e.g., a splitting rule for which $q_4(2,2) = 1$ cannot be sampling
consistent] and Aldous leaves as an open problem the characterization of
sampling consistent Markov branching models (on cladograms). Ford [21] gives
an answer in terms of a certain consistency condition that has to be satisfied
by the associated (binary) splitting rules $q_n$. See also (14) for the general
nonbinary case.

A more explicit answer in the form of an integral representation, also for
nonbinary models, can be obtained from Bertoin's study of homogeneous
fragmentation processes [8]. In the present paper, we interpret sampling
consistent Markov branching models as trees associated with (discrete)
fragmentations, where $T^\circ_n \sim P^\circ_n$ describes the fragmentation of
an initial mass of size $n$ (or of the set $\{1, \ldots, n\}$ for
$T_n \sim P_n$), first into blocks of sizes $k_1, \ldots, k_r$ and then of each
block, recursively, until all blocks have unit size. These branching models
can be characterized in terms of homogeneous fragmentation processes, as
follows.

A homogeneous fragmentation process is a continuous-time continuous-mass
analog of the above discrete fragmentations. The most intuitive class is that
of mass fragmentation processes, that is, right-continuous Markov processes
$(F_t)_{t \ge 0}$ in

    $\mathcal{S}^\downarrow = \{ s = (s_i)_{i \ge 1} : s_1 \ge s_2 \ge \cdots \ge 0,\ \sum_{i \ge 1} s_i \le 1 \}$

whose transition kernels have the property that given state
$s = (s_i)_{i \ge 1}$ at time $u$, each fragment of mass $s_i$ evolves
independently and with distribution identical to $(s_i F_t)_{t \ge 0}$. More
precisely, for each $t \ge 0$, $F_{u+t}$ can be written as the decreasing
rearrangement of masses of all fragments of $s_i F^{(i)}_t$, $i \ge 1$, where
the $F^{(i)}$ are independent copies of $F$. Bertoin has shown that the
distribution of such a process is determined by an erosion coefficient
$c \in \mathbb{R}_+$ and a dislocation measure $\nu$ on $\mathcal{S}^\downarrow$
which satisfies

(1)   $\nu(\{(1, 0, \ldots)\}) = 0$ and $\int_{\mathcal{S}^\downarrow} (1 - s_1)\,\nu(ds) < \infty$.

In the sequel, for $s \in \mathcal{S}^\downarrow$, we let
$s_0 = 1 - \sum_{i \ge 1} s_i \in [0, 1]$.

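Theorem 1. Sampling consistent splitting rules $(q_n, n \ge 2)$ are all of the
following form: if $(k_1, \ldots, k_r)$ is a partition of $n$ with $r \ge 2$
parts, of which exactly $m_i \ge 0$ parts are equal to $i$, $1 \le i \le n$,
then

(2)   $q_n^{(c,\nu)}(k_1, \ldots, k_r) = \frac{1}{Z_n} \Big( n c\, 1_{\{r=2,\, k_2=1\}} + C_{k_1,\ldots,k_r} \int_{\mathcal{S}^\downarrow} \nu(ds) \sum_{l=0}^{m_1} \binom{m_1}{l} s_0^l \sum_{i_1,\ldots,i_{r-l} \ge 1 \text{ distinct}} \prod_{j=1}^{r-l} s_{i_j}^{k_j} \Big)$

for some pair $(c, \nu)$, where $c \ge 0$ and $\nu$ satisfies (1). Here,
$C_{k_1,\ldots,k_r}$ is a combinatorial factor and $Z_n$ the normalization
constant, as follows:

(3)   $C_{k_1,\ldots,k_r} = \frac{n!}{k_1! \cdots k_r!\, m_1! \cdots m_n!}$,   $Z_n = nc + \int_{\mathcal{S}^\downarrow} \Big( 1 - \sum_{i \ge 1} s_i^n \Big) \nu(ds)$.

Moreover, one has $q_n^{(c,\nu)} = q_n^{(c',\nu')}$ for every $n \ge 2$ if and
only if $(c', \nu') = (Kc, K\nu)$ for some $K > 0$.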

The intuitive meaning of (2) is that $(s_j)_{j \ge 0}$ is chosen according to
$\nu$ and an independent sample from $(s_j)_{j \ge 0}$ of size $n$ is taken,
jointly conditioned not to have all sample points in one fragment $s_i$,
$i \ge 1$. The term $s_0$ is special in that each of the $l$ sample points in
$s_0$ is considered a singleton. We note that

    $C_{k_1,\ldots,k_r} \binom{m_1}{l} = \frac{1}{(m_1 - l)!\, \prod_{i=2}^n m_i!} \binom{n}{l, k_1, \ldots, k_{r-l}}$,

where $\binom{n}{l, k_1, \ldots, k_{r-l}}$ is the number of permutations with
the same given frequencies $l, k_1, \ldots, k_{r-l}$. The allocation of indices
$i_j$ to box sizes $k_j$ is such that $(m_1 - l)!\, \prod_{i=2}^n m_i!$
sequences $(i_1, \ldots, i_{r-l})$ lead to the same configuration
$\{(i_1, k_1), \ldots, (i_{r-l}, k_{r-l})\}$ and hence contribute to the
coefficient of the same monomial
$s_0^l s_{i_1}^{k_1} \cdots s_{i_{r-l}}^{k_{r-l}}$.

A homogeneous fragmentation process for which $c = 0$ is said to have no
erosion. Also, a dislocation measure $\nu$ is said to be conservative if

(4)   $\nu(\{ s \in \mathcal{S}^\downarrow : s_0 > 0 \}) = 0$.

The conditions that $c = 0$ and $\nu$ is conservative are equivalent to the
dust-free property of the associated homogeneous fragmentation
$(F_t)_{t \ge 0}$, namely that the terms of $F_t$ sum to 1 for any $t \ge 0$,
a.s. Under these conditions, formula (2) takes the much simpler form



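(5)   $q_n^{(0,\nu)}(k_1, \ldots, k_r) = \frac{1}{Z_n} \binom{n}{k_1, \ldots, k_r} \frac{1}{\prod_{j=1}^n m_j!} \int_{\mathcal{S}^\downarrow} \nu(ds) \sum_{i_1,\ldots,i_r \ge 1 \text{ distinct}} \prod_{j=1}^{r} s_{i_j}^{k_j}$.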

In [9], Bertoin introduced self-similar fragmentation processes
$(F^\alpha_t)_{t \ge 0}$ in $\mathcal{S}^\downarrow$ with parameter
$\alpha \in \mathbb{R}$: given state $s$ at time $u$, the evolution of each
fragment of mass $s_i$ is independent and distributed as
$(s_i F^\alpha_{s_i^\alpha t})_{t \ge 0}$. Once $\alpha$ is fixed, such
processes are in one-to-one correspondence with homogeneous fragmentations and
are hence characterized by an erosion parameter $c$ and a dislocation measure
$\nu$ satisfying (1). In the sequel, we will only deal with negative index
$\alpha < 0$ and write $\gamma = -\alpha$.

For $\gamma > 0$, $c = 0$ and conservative dislocation measures $\nu$,
associated fragmentation trees $\mathcal{T}_{(\gamma,\nu)}$ have been studied
in [29] using Aldous's continuum random tree (CRT) formalism of trees as
subsets of $\ell_1 = \ell_1(\mathbb{N})$ (cf. [2, 3, 4]). Alternative tree
representations have been developed and we shall here use abstract
$\mathbb{R}$-trees as introduced for use in probability by Evans and co-authors
[18, 19, 20] (see also [27]). Following these references, the space of
$\mathbb{R}$-trees will be endowed with the Gromov-Hausdorff metric, which
provides a notion of convergence for these abstract spaces. All the necessary
concepts are discussed in Section 3.3.

Under the regular variation condition

(6)   $\nu(s_1 \le 1 - \varepsilon) = \varepsilon^{-\gamma_\nu}\, \ell(1/\varepsilon)$

for some $\gamma_\nu \in (0, 1)$ and a function $x \mapsto \ell(x)$ slowly
varying as $x \to \infty$, the case $\gamma = \gamma_\nu$ is special. Under the
further regularity condition

(7)   $\int_{\mathcal{S}^\downarrow} \sum_{i \ge 2} s_i\, |\ln(s_i)|^{\rho}\, \nu(ds) < \infty$

for some $\rho > 0$ [this is satisfied, e.g., if $\nu(s_{r+1} > 0) = 0$ for some
$r > 0$], our main theorem identifies the $\gamma_\nu$-self-similar
fragmentation tree as a scaling limit of discrete fragmentation trees
associated with a (homogeneous) fragmentation process or, equivalently, by
Theorem 1, associated with sampling consistent Markov branching models.

Theorem 2. Let $\nu$ be a conservative dislocation measure satisfying (6)
and (7) and $(T^\circ_n)_{n \ge 1}$ a strongly sampling consistent family of
discrete fragmentation trees $T^\circ_n \sim P^\circ_n$ as associated via (5).
If we consider $T^\circ_n$ as a random $\mathbb{R}$-tree (with unit edge
lengths), then there is the convergence in probability

    $\frac{T^\circ_n}{n^{\gamma_\nu}\, \ell(n)\, \Gamma(1 - \gamma_\nu)} \xrightarrow{(p)} \mathcal{T}_{(\gamma_\nu, \nu)}$

with respect to the Gromov-Hausdorff metric, where
$\mathcal{T}_{(\gamma_\nu, \nu)}$ is a $\gamma_\nu$-self-similar fragmentation
tree with dislocation measure $\nu$, defined as a random $\mathbb{R}$-tree on
the same probability space that supports $(T^\circ_n)_{n \ge 1}$.

Note that we obtain a convergence of objects with constant edge lengths to
objects which, heuristically, may be expected to have "shorter" edge lengths
close to the leaves, where the fragmentation rate goes to infinity. Here, we
find that all sufficiently regular dislocation measures $\nu$ have an intrinsic
self-similarity parameter $\gamma_\nu$, which gives a natural scale for the
whole tree. As an application, we obtain limiting continuum random trees for
alpha and beta-splitting models. In [1], Aldous introduced a wide class of
sampling consistent binary Markov branching models, via splitting rules
$q_n(n-k, k)$, $1 \le k \le \lfloor n/2 \rfloor$, $n \ge 2$, which he
symmetrized to model a planar order so that
$\tilde{q}_n(k) = \tilde{q}_n(n-k) = \frac{1}{2} q_n(n-k, k)$ for
$1 \le k < n/2$ and $\tilde{q}_n(k) = q_n(n-k, k)$ if $n = 2k$ is even. That
is, $\tilde{q}_n$ is the distribution of the size of a block selected by a fair
coin toss from the split of a block of size $n$. He then studied in more detail
the one-parameter family

    $\tilde{q}^{\beta}_n(k) = \frac{1}{Z^{(\beta)}_n} \binom{n}{k} \frac{\Gamma(\beta + k + 1)\, \Gamma(\beta + n - k + 1)}{\Gamma(n + 2\beta + 2)}$,   $1 \le k \le n - 1$,

where $\beta > -2$. This beta-splitting model satisfies the conditions of
Theorem 2 for $-2 < \beta < -1$ with $\gamma = -\beta - 1$. As an important
case, when $\beta = -3/2$, the tree $T_n$ is uniform on the binary trees of
$\mathbb{T}_n$. Thus, we reobtain Aldous's theorem [2], stating that the
scaling limit of uniform random variables on $\mathbb{T}_n$ is the celebrated
Brownian continuum random tree, with self-similarity index $-1/2$. This will be
discussed in more detail in Sections 2.4 and 5.1.
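
The beta-splitting probabilities are easy to evaluate numerically, and the case $\beta = -3/2$ can be compared with a direct count of binary trees. The following Python sketch does both. The function names are hypothetical. The count used for the uniform model, $(2m-3)!!$ rooted binary trees with $m$ labeled leaves, is a standard fact that the text does not state and is used here as an assumption. The factor $\Gamma(n + 2\beta + 2)$ does not depend on $k$ and is absorbed into the normalization.

```python
from math import comb, exp, lgamma

def beta_split_probs(n, beta):
    """Symmetrized beta-splitting probabilities for k = 1, ..., n-1."""
    if n < 2 or beta <= -2:
        raise ValueError("need n >= 2 and beta > -2")
    # Gamma(n + 2*beta + 2) is constant in k, so it cancels on normalizing
    weights = [comb(n, k) * exp(lgamma(beta + k + 1) + lgamma(beta + n - k + 1))
               for k in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]

def odd_double_factorial(m):
    """(2m-3)!!, the number of rooted binary trees with m labeled leaves."""
    out = 1
    for j in range(1, 2 * m - 2, 2):
        out *= j
    return out

def uniform_split_probs(n):
    """Size of the first subtree at the root split of a uniform binary tree."""
    weights = [comb(n, k) * odd_double_factorial(k) * odd_double_factorial(n - k)
               for k in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]
```

For instance, `beta_split_probs(n, -1.5)` and `uniform_split_probs(n)` should agree up to floating-point error, and `beta_split_probs(n, 0.0)` should be uniform on $\{1, \ldots, n-1\}$.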

In [21], Ford studied a model based on a simple sequential construction
as follows. The tree $T^\circ_1$ is the unique single-leaf tree in
$\mathbb{T}^\circ_1$. Given $T^\circ_n$, choose one of its edges according to a
weight $1 - \alpha$ for an edge between a leaf and another vertex, and a weight
$\alpha$ for an edge between two other vertices. Split the edge in two,
introduce a new vertex between the two edges and add another edge from the new
vertex to a new leaf. The new random tree is called $T^\circ_{n+1}$. It is
implicit in the work of Ford that this model satisfies the conditions of
Theorem 2 if $0 < \alpha < 1$, with $\gamma = \alpha$, so there is a CRT limit.
We discuss this in more detail in Sections 5.2-5.3.
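
Ford's sequential construction translates directly into a small simulation. The sketch below is illustrative only: the parent-array encoding and the function name `ford_alpha_tree` are hypothetical, and the code restricts to $0 \le \alpha < 1$ so that the single edge of the one-leaf tree has positive weight.

```python
import random
from collections import Counter

def ford_alpha_tree(n, alpha, rng):
    """Grow a planted binary tree with n leaves by the sequential rule.

    Vertices are integers and 0 is the root. Returns (parent, is_leaf), where
    parent[v] is the parent of v (None for the root). Each non-root vertex v
    stands for the edge (parent[v], v).
    """
    if not 0 <= alpha < 1:
        raise ValueError("this sketch assumes 0 <= alpha < 1")
    parent = [None, 0]        # root 0 and the single leaf 1
    is_leaf = [False, True]
    for _ in range(n - 1):
        edges = range(1, len(parent))
        # weight 1 - alpha for leaf edges, alpha for all other edges
        weights = [1 - alpha if is_leaf[v] else alpha for v in edges]
        v = rng.choices(edges, weights=weights)[0]
        b = len(parent)               # new branch point splitting edge (parent[v], v)
        parent.append(parent[v])
        is_leaf.append(False)
        parent[v] = b
        parent.append(b)              # new leaf attached to the new branch point
        is_leaf.append(True)
    return parent, is_leaf

def child_counts(parent):
    """Number of children of each vertex that has at least one child."""
    return Counter(p for p in parent if p is not None)
```

Each growth step adds one branch point and one leaf, so a tree with $n$ leaves has $2n$ vertices in this encoding, the root included.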

Section 2 carefully introduces the discrete framework and establishes the 
characterization of sampling consistent splitting rules in terms of dislocation 
measures of homogeneous fragmentation processes (Theorem 1). Section 3 
introduces the fragmentation CRTs that appear as limits in Theorem 2. 
Section 4 proves Theorem 2. Specifically, we check finite-dimensional
convergence for Theorem 2 (Proposition 7), provide a tightness
estimate (Proposition 9) that allows the extension of finite-dimensional
convergence to convergence in the Gromov-Hausdorff sense (Section 4.2) and
give a version of Theorem 2 for convergence of height functions (Theorem 
15), allowing a planar order and a mass measure to be carried over to the 
limiting CRT. The latter convergence was conjectured by Aldous [1] in the 
special case of the beta-splitting models. Section 5 concludes with applica- 
tions to alpha, beta-splitting and stable trees. 

2. Markov branching models and discrete fragmentation trees. Our goal in
this section is to identify the sampling consistent Markov branching models on
labeled trees with laws of trees that are naturally associated with
homogeneous fragmentations. As first discussed in Bertoin [8], a convenient
way to study homogeneous fragmentation processes is to use a "discretization
of space." This amounts to considering processes that take their values in the
set $\mathcal{P}$ of partitions of the set $\mathbb{N} = \{1, 2, \ldots\}$,
rather than in $\mathcal{S}^\downarrow$. To study these, we need some
terminology and notation.

2.1. Partitions. For $B \subseteq \mathbb{N}$, we let $\mathcal{P}_B$ denote
the set of partitions of $B$ into disjoint nonempty blocks, so
$\mathcal{P} = \mathcal{P}_{\mathbb{N}}$. For $\pi \in \mathcal{P}_B$, we write
$B' \in \pi$ to indicate that $B'$ is a block of $\pi$ and
$i \overset{\pi}{\sim} j$ to indicate that $i, j \in B$ belong to the same block
of $\pi$. We let $\pi_1, \pi_2, \ldots$ be the blocks of $\pi$ ranked by order
of least element, so $\pi_1$ is the block containing the least element of $B$,
$\pi_2$ is the block containing the least element not in $\pi_1$ and so on,
with the convention that $\pi_k = \emptyset$ if $\pi$ has strictly fewer than
$k$ blocks. Thus, any element $\pi$ of $\mathcal{P}_B$ can be represented as a
sequence $(\pi_1, \pi_2, \ldots)$, which might eventually be constant, equal to
$\emptyset$. We also let $\pi_{(i)}$ denote the block of $\pi$ that contains
the integer $i \in B$. If $\pi \in \mathcal{P}_B$ and
$B' \subseteq \mathbb{N}$, we let $\pi|_{B'} = B' \cap \pi$ be the partition of
$B' \cap B$ obtained by restricting $\pi$ to the elements of $B' \cap B$. We
let $\pi|_n = \pi|_{[n]}$ for every $n \ge 1$, where $[n] = \{1, \ldots, n\}$.
By convention, we let $\mathbf{1}_B$ be the trivial partition
$(B, \emptyset, \ldots)$ of $B$ and
$\mathbf{0}_B = (\{i_1\}, \{i_2\}, \ldots)$ be the partition of $B$ into
singletons, where $i_1 < i_2 < \cdots$ is the ranked list of elements of $B$.

In the sequel, the set $\mathcal{P}$ will be endowed with the distance
$\Delta(\pi, \pi') = 2^{-N(\pi, \pi')}$, where
$N(\pi, \pi') = \sup\{ n \ge 1 : \pi|_n = \pi'|_n \} \in \mathbb{N} \cup \{0, \infty\}$,
and the associated Borel $\sigma$-algebra.

We say that a partition $\pi \in \mathcal{P}_B$ is finer than
$\pi' \in \mathcal{P}_B$, and write $\pi \preceq \pi'$, if any block of $\pi$
is included in some block of $\pi'$. This defines a partial order $\preceq$ on
$\mathcal{P}_B$. A process or a sequence with values in $\mathcal{P}_B$ is
called refining if it is decreasing for this partial order.

2.2. Trees. There is a natural relation between trees with labeled leaves
and refining partition-valued processes. Write $B \subset_f \mathbb{N}$ if $B$
is a finite subset of $\mathbb{N}$. For $B \subset_f \mathbb{N}$ with $n$
elements, we let $\mathbb{T}_B$ denote the set of $t$, where each $t$ is a
collection of subsets of $B$ and also contains ROOT $\in t$, such that:

• $B \in t$; we call $B$ the common ancestor in $t$;

• $\{i\} \in t$ for all $i \in B$; we call $\{i\}$, $i \in B$, the leaves of $t$;

• for all $A, C \in t$, either $A \cap C = \emptyset$, or $A \subseteq C$ or $C \subseteq A$.

$A \in t$ is called a descendant of $C \in t$ if $A \subset C$, and $C$ is then
called an ancestor of $A$. A set $A$ is called a child of $C$ and $C$ is called
the parent of $A$ if $A \subset C$ and for all $D \in t$ with
$A \subseteq D \subseteq C$ either $A = D$ or $D = C$. If we equip $t$ with the
parent-child relation and also relate ROOT with $B \in t$, then $t$ is a rooted
connected acyclic graph so that $\mathbb{T}_{[n]}$ can be identified with
$\mathbb{T}_n$ in the notation of the Introduction.

For $t \in \mathbb{T}_B$ and $C \in t$ with $k$ children
$A_1, \ldots, A_k \in t$, $(A_1, \ldots, A_k)$ is a partition of $C$. We can
define the subtrees $t_{A_1}, \ldots, t_{A_k}$ pending from $C$ as
$t_{A_i} = \{\text{ROOT}\} \cup \{ A \in t : A \subseteq A_i \}$. Then
$t_{A_i}$ is an element of $\mathbb{T}_{A_i}$ for $1 \le i \le k$. Conversely,
for any finite sequence of trees
$t_1 \in \mathbb{T}_{B_1}, \ldots, t_k \in \mathbb{T}_{B_k}$, where
$B_1, \ldots, B_k$ are the nonempty blocks of a partition of some
$B \subset_f \mathbb{N}$, we define

    $(t_1, \ldots, t_k) = \{B\} \cup \bigcup_{i=1}^{k} t_i \in \mathbb{T}_B$.

Definition 1. Let $(\pi(t), t \ge 0)$ take values in $\mathcal{P}_B$ for some
$B \subset_f \mathbb{N}$ and be refining. Assume, further, that
$\pi(0) = \mathbf{1}_B$ and $\pi(t) = \mathbf{0}_B$ for some finite $t > 0$. We
define the associated fragmentation tree to be
$t_\pi = \{\text{ROOT}\} \cup \{ A \subseteq B : A \in \pi(t) \text{ for some } t \ge 0 \}$.

A similar association can be made for refining sequences
$(\pi(0), \pi(1), \ldots, \pi(m))$ of partitions of some
$B \subset_f \mathbb{N}$ starting at $\pi(0) = \mathbf{1}_B$ and ending at
$\pi(m) = \mathbf{0}_B$.
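
For a refining sequence of partitions of a finite set, the associated fragmentation tree can be assembled mechanically, and the three defining properties of a tree from Section 2.2 can be checked on the result. The Python sketch below shows one way to do this. The encoding (partitions as sets of frozensets, the string "ROOT" for the root) and all function names are hypothetical choices for the illustration.

```python
def partition(*blocks):
    """Build a partition as a set of frozensets."""
    return {frozenset(b) for b in blocks}

def is_finer(pi, rho):
    """True if every block of pi is contained in some block of rho."""
    return all(any(a <= b for b in rho) for a in pi)

def fragmentation_tree(seq):
    """Collect all blocks of a refining sequence of partitions of a finite set B.

    The sequence must start at the trivial partition {B} and end at the
    partition of B into singletons.
    """
    B = frozenset().union(*seq[0])
    assert seq[0] == {B}
    assert seq[-1] == {frozenset([i]) for i in B}
    assert all(is_finer(seq[m + 1], seq[m]) for m in range(len(seq) - 1))
    tree = {"ROOT"}
    for pi in seq:
        tree |= pi
    return tree

def is_tree(t, B):
    """Check the defining properties of an element of T_B."""
    blocks = [a for a in t if a != "ROOT"]
    nested = all(not (a & c) or a <= c or c <= a for a in blocks for c in blocks)
    return ("ROOT" in t and frozenset(B) in t
            and all(frozenset([i]) in t for i in B) and nested)
```

For example, the sequence $\{1,2,3,4\}$, then $\{1,2\},\{3,4\}$, then $\{1\},\{2\},\{3,4\}$, then singletons, yields a tree with eight elements: ROOT, the common ancestor, the two blocks $\{1,2\}$ and $\{3,4\}$, and the four leaves.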

2.3. Homogeneous fragmentations. If $\Pi$ is a random variable with values
in $\mathcal{P}_B$, then we say that $\Pi$ is exchangeable if its law is
invariant under the natural action of the permutations of $B$. Similarly, a
$\mathcal{P}_B$-valued process $(\Pi(t), t \ge 0)$ is exchangeable if its law
is invariant under the action of permutations of $B$.

Definition 2. Let $B \subseteq \mathbb{N}$ and consider a
$\mathcal{P}_B$-valued Markov process $(\Pi(t), t \ge 0)$. We assume that for
every $t, t' \ge 0$, the distribution of $\Pi(t + t')$, given $\Pi(t) = \pi$,
is the same as that of the random partition whose blocks are given by

    the blocks of $\pi_i \cap \pi^{(i)}$, $i \ge 1$,

where $\pi^{(1)}, \pi^{(2)}, \ldots$ is an i.i.d. sequence of exchangeable
partitions of $\mathbb{N}$ (whose common law depends only on $t'$). Then the
process $\Pi$ is called a homogeneous fragmentation of $B$.

When a homogeneous fragmentation of $B$ starts from the trivial partition
$\mathbf{1}_B$ of $B$, we say that the process is standard. We will also assume
nondegeneracy of the process, namely that it is not constant a.s. It is then
elementary from the definition that (nondegenerate) homogeneous fragmentation
processes are refining processes whose blocks all decrease to singletons. In
view of the preceding section (Definition 1), this allows us to introduce the
following definition.

Definition 3. Let $(\Pi(t), t \ge 0)$ be a standard homogeneous fragmentation
of some finite $B \subset \mathbb{N}$. The tree
$T_B := t_\Pi \in \mathbb{T}_B$ is called the discrete fragmentation tree
associated with $\Pi$.

As argued by Bertoin, a $\mathcal{P}$-valued process $\Pi$ is a homogeneous
fragmentation if and only if its restrictions to $[n]$ are homogeneous
fragmentations of $[n]$, $n \ge 1$. In other words, homogeneous fragmentations
of $\mathbb{N}$ are the same as consistent families of homogeneous
fragmentations of $[n]$, $n \ge 1$. Obviously, this amounts to a consistency
property for the associated sequence $T_{[n]}$, $n \ge 1$, of fragmentation
trees, namely that $T_{[n]}$ is the tree obtained from $T_{[n+1]}$ by removing
the leaf with label $n + 1$ (and the internal vertex if it has only two other
neighbors, that will then be connected by a direct edge instead). We claim that
the laws $(P_n, n \ge 1)$ associated with sampling consistent splitting rules
as explained in the Introduction are in one-to-one correspondence with the
sequence of distributions of trees $T_{[n]}$, $n \ge 1$, associated with some
homogeneous fragmentation of $\mathbb{N}$.

Before we tackle this (in Proposition 3), we need some more notation. Let
$s = (s_j, j \in \mathbb{N}) \in \mathbb{R}_+^{\mathbb{N}}$ have total sum
$\sum_{j \in \mathbb{N}} s_j \le 1$. By setting
$s_0 = 1 - \sum_{j \in \mathbb{N}} s_j$, we define a probability mass function
$(s_j)_{j \ge 0}$ on $\mathbb{N} \cup \{0\}$. Independent random variables
$(I_i, i \ge 1)$ with probability mass function $(s_j)_{j \ge 0}$ can be
interpreted as an urn scheme, with urns labeled by $\mathbb{N}$ and a "dustbin"
with label 0.

As shown by Bertoin, (the laws of) standard homogeneous fragmentations of
$\mathbb{N}$ are in one-to-one correspondence with $\sigma$-finite measures
$\kappa$ on $\mathcal{P}$ that satisfy

    $\kappa(\{ \pi \in \mathcal{P} : \pi|_n \ne \mathbf{1}_{[n]} \}) < \infty$ for all $n \ge 1$

and which informally correspond to the jump measures of the fragmentation
processes. We call such measures dislocation measures on $\mathcal{P}$. As
shown in [8], such measures admit the following, simple, representation. For
$s \in \mathcal{S}^\downarrow$, we let $\kappa_s$ be the distribution on
$\mathcal{P}$ of the random variable $\Pi$ obtained by Kingman's paintbox
construction: let $I_1, I_2, \ldots$ be i.i.d. with law $(s_j)_{j \ge 0}$ and
let $i, j$ be in the same block of $\Pi$ if and only if $i = j$ or
$I_i = I_j > 0$. Then for every dislocation measure $\kappa$ on $\mathcal{P}$,
there exists $c \ge 0$ and a measure $\nu$ on $\mathcal{S}^\downarrow$
satisfying (1) such that

(8)   $\kappa(d\pi) = \int_{\mathcal{S}^\downarrow} \kappa_s(d\pi)\, \nu(ds) + c \sum_{i=1}^{\infty} \delta_{\varepsilon_i}(d\pi)$,

where $\varepsilon_i$ is the partition of $\mathbb{N}$ into two blocks $\{i\}$
and $\mathbb{N} \setminus \{i\}$.
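
Kingman's paintbox construction is straightforward to simulate for the restriction to $[n]$. The following Python sketch is a minimal illustration: the function name `paintbox` and the use of a finite list of masses (a truncated element of $\mathcal{S}^\downarrow$) are assumptions made for the example.

```python
import random

def paintbox(s, n, rng):
    """Sample the restriction to [n] of a paintbox partition.

    `s` is a finite list of nonnegative masses with sum <= 1; the missing
    mass s_0 = 1 - sum(s) is the dust. Returns the blocks as sorted lists,
    ordered by least element.
    """
    s0 = 1.0 - sum(s)
    assert all(x >= 0 for x in s) and s0 >= -1e-12
    weights = [max(s0, 0.0)] + list(s)
    # urns[i-1] plays the role of I_i; urn 0 is the dustbin
    urns = rng.choices(range(len(weights)), weights=weights, k=n)
    blocks = {}
    for i, urn in enumerate(urns, start=1):
        # integers that fall in the dustbin form singleton blocks
        key = ("dust", i) if urn == 0 else ("urn", urn)
        blocks.setdefault(key, []).append(i)
    return sorted(blocks.values(), key=lambda b: b[0])
```

With `s = [1.0]` every integer falls in the same urn and the result is the trivial partition. With `s = []` all mass is dust and the result is the partition into singletons.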



2.4. Characterization of sampling consistent Markov branching models.
We are now almost ready to give the proof of Theorem 1. For any distribution
$q_n$ on partitions of the integer $n$ (splitting rule), and for $B$ with $n$
elements, we define the associated exchangeable splitting rule on
$\mathcal{P}_B \setminus \{\mathbf{1}_B\}$, which is the probability
distribution on $\mathcal{P}_B$ defined by

(9)   $q_B(\pi) = \binom{n}{k_1, \ldots, k_r}^{-1} \Big( \prod_{i=1}^{n} m_i! \Big)\, q_n(k_1, \ldots, k_r)$

whenever $\pi$ is a partition of $B$ with $r$ nonempty blocks of sizes
$k_1 \ge \cdots \ge k_r$, block size $i$ appearing with frequency $m_i$,
$1 \le i \le n$. Informally, this is what we obtain when uniformly choosing a
partition in $\mathcal{P}_B$ that is compatible with a partition of $n$ that
has been sampled according to $q_n$. It is elementary that a random partition
with law $q_B$ is exchangeable.
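
The prefactor relating $q_B$ and $q_n$ is the reciprocal of the number of partitions of an $n$-element set with block sizes $k_1, \ldots, k_r$, that is, of $n! / (k_1! \cdots k_r!\, m_1! \cdots m_n!)$. The short Python check below compares this closed form with a brute-force enumeration. The function names are hypothetical and the enumeration is only practical for small $n$.

```python
from collections import Counter
from math import factorial

def num_set_partitions(sizes):
    """n! / (prod_j k_j! * prod_i m_i!) for block sizes k_1, ..., k_r."""
    denom = 1
    for k in sizes:
        denom *= factorial(k)
    for m in Counter(sizes).values():   # m_i = number of blocks of size i
        denom *= factorial(m)
    return factorial(sum(sizes)) // denom

def set_partitions(elements):
    """Yield all partitions of a list, each as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for part in set_partitions(rest):
        yield [[first]] + part                      # `first` as a singleton
        for j in range(len(part)):                  # or added to block j
            yield part[:j] + [[first] + part[j]] + part[j + 1:]

def brute_force_count(sizes):
    """Count partitions of {1..n} whose ranked block sizes equal `sizes`."""
    n = sum(sizes)
    target = sorted(sizes, reverse=True)
    return sum(1 for p in set_partitions(list(range(1, n + 1)))
               if sorted(map(len, p), reverse=True) == target)

def labeled_probability(q_n_value, sizes):
    """q_B(pi) for one labeled partition pi with the given block sizes."""
    return q_n_value / num_set_partitions(sizes)
```

For block sizes $(2,1,1)$ both counts give 6, so each labeled partition of that type receives one sixth of $q_4(2,1,1)$.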

Also, it is plain that the laws $(P_n, n \ge 1)$ on labeled trees associated
with (not necessarily sampling consistent) splitting rules $(q_n, n \ge 2)$
can also be described as follows. Define $P_{\{i\}}$ to be the Dirac mass on
the only element of $\mathbb{T}_{\{i\}}$, in the notation of Section 2.2. Then,
recursively, define $P_B$ as the law of $(t_1, \ldots, t_r)$, where $\pi$ is
taken at random according to $q_B$ and, given
$\pi = (\pi_1, \ldots, \pi_r, \emptyset, \ldots)$ with $\pi_r \ne \emptyset$,
$t_1, \ldots, t_r$ are picked independently in
$\mathbb{T}_{\pi_1}, \ldots, \mathbb{T}_{\pi_r}$ with respective laws
$P_{\pi_1}, \ldots, P_{\pi_r}$. Then $P_n = P_{[n]}$. Moreover,
$(q_n, n \ge 2)$ is sampling consistent if and only if the image distribution
of $P_{[n+1]}$ by the operation that removes the leaf with label $n + 1$ is
$P_{[n]}$.

Proposition 3. Sampling consistent splitting rules $(q_n, n \ge 2)$ are in
one-to-one correspondence with dislocation measures $\kappa$ on
$\mathcal{P}_{\mathbb{N}}$ of homogeneous fragmentation processes (modulo
constant multiples).

More precisely, for any $(q_n)_{n \ge 2}$, the formulas $\lambda_2 = 1$,

(10)   $\lambda_{n+1} = \frac{\lambda_n}{1 - q_{[n+1]}(\{1, \ldots, n\}, \{n+1\})}$

and

(11)   $\kappa(\{ \Gamma \in \mathcal{P} : \Gamma|_n = \pi \}) = \lambda_n\, q_{[n]}(\pi)$,

for all $\pi \in \mathcal{P}_{[n]} \setminus \{\mathbf{1}_{[n]}\}$, define a
dislocation measure $\kappa$ on $\mathcal{P}$.

Conversely, for any dislocation measure $\kappa$,

(12)   $q_B(\pi) = \frac{\kappa(\{ \Gamma \in \mathcal{P} : \Gamma|_B = \pi \})}{\kappa(\{ \Gamma \in \mathcal{P} : \Gamma|_B \ne \mathbf{1}_B \})}$,   $\pi \in \mathcal{P}_B \setminus \{\mathbf{1}_B\}$ and $B \subset_f \mathbb{N}$,

defines an exchangeable sampling consistent splitting rule.

Moreover, if $\Pi$ is a homogeneous fragmentation process with dislocation
measure $\kappa$, then the sequence of distributions of the discrete
fragmentation trees $T_{[n]}$, $n \ge 1$, is exactly $(P_n, n \ge 1)$, as
associated with $(q_n, n \ge 2)$.

Proof. Let $(\Pi(t))_{t \ge 0}$ be a homogeneous $\mathcal{P}$-valued
fragmentation process with dislocation measure $\kappa$. For
$B \subset_f \mathbb{N}$, let $q_B$ be the distribution of
$\pi = \Pi|_B(D_B)$, where
$D_B = \inf\{ t \ge 0 : \Pi|_B(t) \ne \mathbf{1}_B \}$. It is plain that the
$q_B$ are exchangeable. Thus, they specify partition-valued splitting rules. We
denote the associated "unlabeled" splitting rules (i.e., on partitions of $n$)
by $q_n$, $n \ge 2$.

By the strong Markov property ([8]) applied at time $D_B$, given
$\pi = (\pi_1, \ldots, \pi_r, \emptyset, \ldots)$ with
$\pi_r \ne \emptyset$, the processes $(\Pi|_{\pi_i}(D_B + t), t \ge 0)$ for
$1 \le i \le r$ are independent and, respectively, have the same law as
$\Pi|_{\pi_i}$, $1 \le i \le r$. From this, we see that the discrete
fragmentation tree $T_B = t_{\Pi|_B}$ has distribution $P_B$ associated with
the splitting rules $q_B$. Sampling consistency for the splitting rules $q_n$,
$n \ge 2$, follows immediately from the property that $T_{[n]}$ is obtained
from $T_{[n+1]}$ by deletion of the leaf with label $n + 1$. It is argued in
Bertoin [8] that $q_B$ is indeed given by the formula (12) in the case
$B = [n]$ and the general case follows by exchangeability.

Conversely, a sampling consistent system of Markov branching models
allows us to consider a strongly sampling consistent system of trees
$T_n \sim P_n$, $n \ge 1$. Note that $T_n$ and $T_{n+1}$ are related in one of
two possible ways: with probability
$p_{n+1} := q_{[n+1]}(\{1, \ldots, n\}, \{n+1\})$, the branch point adjacent to
the root in $T_{n+1}$ splits into $\{1, \ldots, n\}$ and $\{n+1\}$ and has
$T_n$ as a subtree; with probability $1 - p_{n+1}$, the branch point closest to
the root in $T_{n+1}$ can be identified with the branch point closest to the
root of $T_n$. Necessarily, if $P_n$, $n \ge 1$, can be obtained from some
homogeneous fragmentation process $\Pi$, then the holding rates
$\lambda_n = E[D_{[n]}]^{-1}$, $n > 1$, for the state $\mathbf{1}_{[n]}$ of the
Markov process $\Pi|_n$ should thus satisfy

    $P(D_{[n]} = D_{[n+1]}) = 1 - p_{n+1}$ and, on $\{ D_{[n]} \ne D_{[n+1]} \}$,   $D_{[n]} = D_{[n+1]} + \tilde{D}_n$,

where $\tilde{D}_n$ is independent of $D_{[n+1]}$ and exponential with rate
$\lambda_n$. Taking expectations, we get

    $\lambda_n^{-1} = (1 - p_{n+1})\, \lambda_{n+1}^{-1} + p_{n+1} (\lambda_{n+1}^{-1} + \lambda_n^{-1}) \;\Longrightarrow\; \lambda_n = (1 - p_{n+1})\, \lambda_{n+1}$.

If we arbitrarily put $\lambda_2 = 1$, this determines
$(\lambda_n)_{n \ge 3}$ from $(q_n)_{n \ge 2}$. Furthermore, by the same
reasoning, we get, for all
$\pi \in \mathcal{P}_{[n]} \setminus \{\mathbf{1}_{[n]}\}$,

(13)   $q_{[n]}(\pi) = q_{[n+1]}(\{ \Gamma \in \mathcal{P}_{[n+1]} : \Gamma|_n = \pi \}) + q_{[n+1]}(\{1, \ldots, n\}, \{n+1\})\, q_{[n]}(\pi)$,

that is, after rearrangement and multiplication by $\lambda_{n+1}$,

    $\lambda_n\, q_{[n]}(\pi) = \lambda_{n+1}\, q_{[n+1]}(\{ \Gamma \in \mathcal{P}_{[n+1]} : \Gamma|_n = \pi \})$

so that we can define $\kappa$ consistently by (11) as a $\sigma$-finite
measure on $\mathcal{P}_{\mathbb{N}}$.

Finally, for the uniqueness, note that the choice of $\lambda_2$ was arbitrary
and any other choice $\lambda_2 \in (0, \infty)$ leads to a constant multiple
of $\kappa$, that is, a linear time change of an associated fragmentation
process. It is easily checked that if $\kappa$ is defined by (10) and (11) for
any $\lambda_2 \in (0, \infty)$, then (12) holds for $B = [n]$ and then for
$B \subseteq [n]$; and if $q_{[n]}$ is defined by (12), then (10) and (11) hold
with
$\lambda_n = \kappa(\{ \Gamma \in \mathcal{P}_{\mathbb{N}} : \Gamma|_n \ne \mathbf{1}_{[n]} \})$. □
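
The recursion for the holding rates can be checked on a concrete example. The Python sketch below uses, as an assumed test case not discussed at this point of the text, the binary rule whose symmetrized split probabilities are uniform, $\tilde{q}_n(k) = 1/(n-1)$. For this rule the total mass of splits of $[n]$ is proportional to $\int_0^1 (1 - x^n - (1-x)^n)\,dx = (n-1)/(n+1)$, so with $\lambda_2 = 1$ one expects $\lambda_n = 3(n-1)/(n+1)$. The function names are hypothetical.

```python
from fractions import Fraction

def q_tilde_uniform(n, k):
    """Symmetrized split probability of the assumed test rule: 1/(n-1)."""
    return Fraction(1, n - 1)

def holding_rates(n_max, q_tilde):
    """lambda_2, ..., lambda_{n_max} from lambda_{n+1} = lambda_n / (1 - p_{n+1})."""
    lam = {2: Fraction(1)}
    for n in range(2, n_max):
        # unlabeled probability q_{n+1}(n, 1) that a single leaf splits off
        q_n1 = q_tilde(n + 1, 1) + q_tilde(n + 1, n)
        # there are n + 1 labeled splits of type (n, 1); p_{n+1} is the
        # probability of the specific one ({1..n}, {n+1})
        p = q_n1 / (n + 1)
        lam[n + 1] = lam[n] / (1 - p)
    return lam
```

Exact rational arithmetic makes the comparison with the closed form an equality rather than an approximate check.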

The consistency equation (13) can be written in terms of $q_n$ as

(14)   $q_n(k_1, \ldots, k_r) = \sum_{j=1}^{r} \frac{(k_j + 1)(m_{k_j + 1} + 1)}{(n + 1)\, m_{k_j}}\, q_{n+1}((k_1, \ldots, k_j + 1, \ldots, k_r)^\downarrow) + \frac{m_1 + 1}{n + 1}\, q_{n+1}(k_1, \ldots, k_r, 1) + \frac{1}{n + 1}\, q_{n+1}(n, 1)\, q_n(k_1, \ldots, k_r)$,

which is structurally similar to, but not the same as, the backward
recursions for the rows of the decrement matrix associated with coalescents
with simultaneous multiple collisions; see [13]. The binary special case was
already obtained by Ford [21], Proposition 41, and can be compared with
coalescents with no simultaneous but multiple collisions, as in [14]. See also
[24, 25] for similar recursions in the context of regenerative composition and
partition structures.
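
The consistency recursion can be tested numerically. In the Python sketch below, the right-hand side is taken to be $\sum_j \frac{(k_j+1)(m_{k_j+1}+1)}{(n+1) m_{k_j}} q_{n+1}(\ldots, k_j+1, \ldots) + \frac{m_1+1}{n+1} q_{n+1}(k_1, \ldots, k_r, 1) + \frac{1}{n+1} q_{n+1}(n,1)\, q_n(k_1, \ldots, k_r)$, where $m_i$ counts the parts equal to $i$. Two rules are compared, both of them assumed examples: the binary rule with uniform symmetrized splits, and a modification with $q_4(2,2) = 1$, which the Introduction notes cannot be sampling consistent. All names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

def q_uniform(sizes):
    """Binary rule with uniform symmetrized splits: q_n(n-k, k) for a ranked pair."""
    if len(sizes) != 2:
        return Fraction(0)
    a, b = sizes
    n = a + b
    return Fraction(1, n - 1) if a == b else Fraction(2, n - 1)

def q_bad(sizes):
    """Same rule, except that blocks of size 4 always split as (2, 2)."""
    if sizes == (2, 2):
        return Fraction(1)
    if sizes == (3, 1):
        return Fraction(0)
    return q_uniform(sizes)

def rhs_consistency(sizes, q):
    """Right-hand side of the consistency equation for a ranked tuple `sizes`."""
    n, m = sum(sizes), Counter(sizes)
    total = Fraction(0)
    for j, kj in enumerate(sizes):
        # leaf n+1 joins block j; re-rank the resulting block sizes
        bigger = tuple(sorted(sizes[:j] + (kj + 1,) + sizes[j + 1:], reverse=True))
        total += Fraction((kj + 1) * (m[kj + 1] + 1), (n + 1) * m[kj]) * q(bigger)
    # leaf n+1 forms a new singleton block
    total += Fraction(m[1] + 1, n + 1) * q(tuple(sizes) + (1,))
    # leaf n+1 splits off first, then [n] splits according to q_n
    total += Fraction(1, n + 1) * q((n, 1)) * q(sizes)
    return total
```

For the uniform rule the two sides agree for every binary partition that was tried, whereas for the modified rule the equation already fails at the partition $(3,1)$ of $n = 4$.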

Proof of Theorem 1. The fact that all sampling consistent splitting rules
are of the stated form is now a simple exercise using (12), (8) and (9). The
theorem will be proven if we show that $c$, $\nu$ can be recovered from the
dislocation measure $\kappa$ associated, up to a constant multiple, with a
sampling consistent splitting rule as in Proposition 3. Obviously,
$c = \kappa(\{\varepsilon_1\})$, so we can assume $c = 0$ in (8). We then use
Kingman's paintbox construction to obtain that $\kappa$-almost every $\pi$ has
asymptotic frequencies, and that the restriction of $\kappa$ to
$A_\varepsilon := \{ \pi \in \mathcal{P} : \max_i |\pi_i| \le 1 - \varepsilon \}$
is finite with total mass
$m_\varepsilon = \nu(\{ s \in \mathcal{S}^\downarrow : s_1 \le 1 - \varepsilon \})$.
Then the probability measure
$\nu(\,\cdot \cap \{ s : s_1 \le 1 - \varepsilon \}) / m_\varepsilon$ is just
the distribution of $|\pi|^\downarrow$ under
$\kappa(\,\cdot \cap A_\varepsilon) / m_\varepsilon$ so that $\nu$ is recovered
from $\kappa$. □

The binary special case is worth discussing separately. It is characterized
by those dislocation measures that have the property

    $\kappa(\{ \pi = (\pi_1, \pi_2, \ldots) \in \mathcal{P}_{\mathbb{N}} : \pi_1 \cup \pi_2 \ne \mathbb{N} \}) = 0$.

Writing $\kappa = \kappa_0 + c \sum_{i \ge 1} \delta_{(\{i\}, \mathbb{N} \setminus \{i\})}$
(for the highest $c$ such that $\kappa_0$ is a nonnegative measure) and using
the one-to-one correspondence of dislocation measures on
$\mathcal{P}_{\mathbb{N}}$ and pairs of erosion coefficient $c \ge 0$ and
dislocation measure $\nu$ on $\mathcal{S}^\downarrow$, these correspond to
$(c, \nu)$ with

(15)   $c \ge 0$ and $\nu(\{ (s_i)_{i \ge 1} \in \mathcal{S}^\downarrow : s_1 + s_2 < 1 \}) = 0$.

The presentation is nicest for a symmetrized setting. We define
$\tilde{\nu}(A) = \frac{1}{2} ( \nu(s_1 \in A) + \nu(s_2 \in A) )$ for Borel
sets $A \subseteq [0, 1]$.

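Corollary 4. Sampling consistent binary splitting rules $q_n$, $n \ge 2$, are
in one-to-one correspondence (modulo constant multiples) with pairs
$(c, \nu)$ satisfying (15).

Specifically, for any $(c, \nu)$,

    $\tilde{q}_n(k) = \frac{1}{Z_n} \Big( \binom{n}{k} \int_{(0,1)} x^k (1 - x)^{n-k}\, \tilde{\nu}(dx) + n c\, 1_{\{k = 1\}} \Big)$,

$1 \le k \le n - 1$, where
$Z_n = \int_{(0,1)} (1 - x^n - (1 - x)^n)\, \tilde{\nu}(dx) + nc$ is the
normalization constant, induces a sampling consistent splitting rule by
$q_n(n-k, k) = \tilde{q}_n(k) + \tilde{q}_n(n-k)$, $1 \le k < n/2$,
$q_n(n/2, n/2) = \tilde{q}_n(n/2)$, $n$ even.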

The symmetric splitting rules $\tilde{q}_n(k)$ [for $c = 0$ and
$\tilde{\nu}(dx) = f(x)\,dx$ absolutely continuous] give Aldous's (planar)
Markov branching models and Corollary 4 shows that, essentially, Aldous had
found all binary exchangeable sampling consistent Markov branching models
without erosion and expressed them in terms of (a density of) a binary
dislocation measure.

Vice versa, we can calculate $c$ and $\tilde{\nu}$ from
$n^{-1} Z_n \tilde{q}_n(1) \to c$ and

(16)   $Z_n \sum_{an < k \le bn} \tilde{q}_n(k) = \sum_{an < k \le bn} \int_{(0,1)} \binom{n}{k} x^k (1 - x)^{n-k}\, \tilde{\nu}(dx) \to \tilde{\nu}([a, b])$

for any continuity points $0 < a < b < 1$ for $\tilde{\nu}$, provided $Z_n$ (or
another normalization sequence $\tilde{Z}_n \sim Z_n$) can be calculated. The
proof, which is easily done using de Finetti's representation for exchangeable
sequences of 0's and 1's, is left as an exercise to the reader.
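
The recovery of $\tilde{\nu}$ from the splitting rules can be illustrated numerically. The Python sketch below assumes, for the sake of the example, the density $\tilde{\nu}(dx) = x^\beta (1-x)^\beta\,dx$, for which the Beta integral gives $\binom{n}{k} \int_0^1 x^{k+\beta}(1-x)^{n-k+\beta}\,dx = \binom{n}{k} \Gamma(k+\beta+1)\Gamma(n-k+\beta+1)/\Gamma(n+2\beta+2)$. It then compares the sum over $an < k \le bn$ with a midpoint-rule value of $\tilde{\nu}([a,b])$. The function names, the choice $\beta = -3/2$ in the usage and the quadrature are all illustrative choices.

```python
from math import exp, lgamma

def scaled_mass(n, a, b, beta):
    """Sum over an < k <= bn of binom(n,k) * B(k+beta+1, n-k+beta+1)."""
    lo, hi = int(a * n), int(b * n)
    const = lgamma(n + 1) - lgamma(n + 2 * beta + 2)
    total = 0.0
    for k in range(lo + 1, hi + 1):
        # work on the log scale to avoid overflow for large n
        log_term = (const - lgamma(k + 1) - lgamma(n - k + 1)
                    + lgamma(k + beta + 1) + lgamma(n - k + beta + 1))
        total += exp(log_term)
    return total

def nu_tilde_mass(a, b, beta, steps=20000):
    """Midpoint-rule approximation of the integral of (x(1-x))^beta over [a, b]."""
    h = (b - a) / steps
    return h * sum(((a + (i + 0.5) * h) * (1 - a - (i + 0.5) * h)) ** beta
                   for i in range(steps))
```

For example, `scaled_mass(2000, 0.2, 0.7, -1.5)` can be compared with `nu_tilde_mass(0.2, 0.7, -1.5)`; the two values should be close, with a discrepancy that shrinks as $n$ grows.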

3. Self-similar fragmentations and continuum trees. In this section, we
lay the groundwork needed to prove convergence of discrete fragmentation trees
to some continuum random trees that are naturally related to the so-called
self-similar fragmentations [9, 10].

3.1. Self-similar fragmentations. A nice feature of exchangeable partitions
in the case $B = \mathbb{N}$ is that Kingman's theory [31] entails that the
blocks of such a partition $\Pi$ admit asymptotic frequencies almost surely,
namely

    $|\Pi_i| := \lim_{n \to \infty} \frac{\#(\Pi_i \cap [n])}{n}$.

We let $|\Pi| = (|\Pi_i|, i \ge 1)$ and $|\Pi|^\downarrow$ be the random
element of $\mathcal{S}^\downarrow$ obtained from $|\Pi|$ by ranking its terms
in decreasing order.

Let $(\Pi(t), t \ge 0)$ be an exchangeable càdlàg (right-continuous with left
limits) $\mathcal{P}$-valued stochastic process such that
$\Pi(0) = \mathbf{1}_{\mathbb{N}}$, and $|\Pi(t)|$ exists for every $t \ge 0$,
a.s. Suppose, also, that the process of sizes of the block containing $i$,
$(|\Pi_{(i)}(t)|, t \ge 0)$, is right-continuous for every
$i \in \mathbb{N}$ a.s.

Definition 4. The process $(\Pi(t), t \ge 0)$ is a $\mathcal{P}$-valued
self-similar fragmentation process with index $\alpha \in \mathbb{R}$ if, given
$\Pi(t) = \pi$, the random variable $\Pi(t + s)$ has the same law as the random
partition whose blocks are those of
$\pi_i \cap \Pi^{(i)}(|\pi_i|^\alpha s)$, $i \ge 1$, where
$(\Pi^{(i)}, i \ge 1)$ is a sequence of i.i.d. copies of
$(\Pi(t), t \ge 0)$.

When $\alpha = 0$, we recover the definition of standard homogeneous
fragmentations in $\mathcal{P}$. To avoid trivialities, we will only work with
nonconstant processes. We notice that if $(\Pi(t), t \ge 0)$ is a self-similar
$\mathcal{P}$-valued fragmentation, then $(|\Pi(t)|^\downarrow, t \ge 0)$ is a
self-similar fragmentation with values in $\mathcal{S}^\downarrow$, as defined
in the Introduction (and any $\mathcal{S}^\downarrow$-valued fragmentation can
be represented in this form; see [7, 9]). Bertoin has shown in [9] that
$\mathcal{P}$-valued self-similar fragmentations are characterized by a triple
$(\alpha, c, \nu)$, where $c \ge 0$ and $\nu$ is a dislocation measure (1) on
$\mathcal{S}^\downarrow$. Hereafter, we will only be interested in the cases
where $c = 0$ and $\nu$ is conservative (4) (no sudden loss of mass and no
erosion). We call $(\alpha, \nu)$ the characteristic pair of such self-similar
fragmentations.

There is a useful way to relate self-similar fragmentations to homogeneous 
fragmentations, which is as follows. 

Lemma 5 ([9]). Let $(\Pi^0(t), t \ge 0)$ be a standard homogeneous fragmentation
with dislocation measure $\nu$ and let $\alpha \in \mathbb{R}$. We then define a sequence
of time changes,

(17)  $\eta_{(i)}(t) = \inf\{u \ge 0 : \int_0^u |\Pi^0_{(i)}(w)|^{-\alpha}\,dw > t\}$,  $t \ge 0$, $i \ge 1$.

Let $\Pi^{\alpha}(t)$ be the element of $\mathcal{P}$ whose blocks are those of the partitions
$\Pi^0_{(i)}(\eta_{(i)}(t))$, $i \ge 1$. Then:

(i) the process $(\Pi^{\alpha}(t), t \ge 0)$ is a self-similar fragmentation with
characteristic pair $(\alpha, \nu)$;

(ii) for the size $|\Pi^0_{(i)}(t)|$ of the block containing $i$, the process
$\xi_{(i)}(t) = -\log|\Pi^0_{(i)}(t)|$, $t \ge 0$, is a pure-jump subordinator with Lévy measure

(18)  $\Lambda(dx) = e^{-x} \sum_{i \ge 1} \nu((s_j)_{j \ge 1} \in \mathcal{S}^{\downarrow} : -\log s_i \in dx)$.

Thus, $|\Pi^{\alpha}_{(i)}(t)| = \exp(-\xi_{(i)}(\eta_{(i)}(t)))$, where

(19)  $\eta_{(i)}(t) = \inf\{u \ge 0 : \int_0^u e^{\alpha \xi_{(i)}(w)}\,dw > t\}$,  $t \ge 0$.

We refer to [9] for the proof of this result. Note that because the
partitions $\Pi^0(t)$ are refining as $t$ increases, if two of the blocks
$\Pi^0_{(i)}(\eta_{(i)}(t))$, $i \ge 1$, have a common element, then they are equal and the
definition of $\Pi^{\alpha}(t)$ makes sense.
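The time change (19) is easy to evaluate when the subordinator path is piecewise constant. The following is a minimal sketch under that assumption; the function names `time_change` and `block_size` and the list-based encoding of the path are illustrative choices, not notation from the paper.

```python
import bisect
import math

def time_change(jump_times, values, alpha, t):
    """Return eta(t) = inf{u >= 0 : int_0^u exp(alpha * xi(w)) dw > t}.

    Assumed encoding: xi equals values[k] on [jump_times[k], jump_times[k+1])
    and values[-1] after the last jump time; jump_times[0] must be 0.
    """
    if not jump_times or jump_times[0] != 0:
        raise ValueError("jump_times must start at 0")
    acc = 0.0  # value of the integral at the start of the current interval
    for k, start in enumerate(jump_times):
        rate = math.exp(alpha * values[k])
        end = jump_times[k + 1] if k + 1 < len(jump_times) else math.inf
        piece = rate * (end - start)
        if acc + piece > t:
            return start + (t - acc) / rate
        acc += piece
    return math.inf  # only reached when t itself is infinite

def block_size(jump_times, values, alpha, t):
    """Size exp(-xi(eta(t))) of the tagged block in the time-changed process."""
    u = time_change(jump_times, values, alpha, t)
    k = bisect.bisect_right(jump_times, u) - 1
    return math.exp(-values[k])
```

For $\alpha = 0$ the time change is the identity, and for $\alpha < 0$ the clock runs faster as the block shrinks, which is what makes the extinction time finite in the self-similar case.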

3.2. Trees with edge lengths. Let $(\Pi(t), t \ge 0)$ be a self-similar
fragmentation process with index $\alpha$. We may then construct a family of random
trees $T_B$ indexed by $B \subset \mathbb{N}$, defined by $T_B = \mathbf{t}_{\Pi|_B}$, the fragmentation tree
associated with the restrictions of $(\Pi(t), t \ge 0)$ to $B$ (see Definition 1). The
time-change construction of Lemma 5 provides a coupling of all self-similar
fragmentations with the same dislocation measure and different indices $\alpha \in \mathbb{R}$,
all with the same $T_B$. The only difference lies in the times at which splits
occur, which do not appear in $T_B$. These times provide extra information
on the tree associated with a fragmentation process, which we can interpret
as edge lengths associated with the fragmentation tree $T_B$ for a particular
index $\alpha \in \mathbb{R}$.

A (rooted labeled) tree with edge lengths is a pair $\theta = (\mathbf{t}, (\eta_e, e \in E(\mathbf{t})))$,
where $\mathbf{t} \in \mathbb{T}_B$ for some $B \subset \mathbb{N}$, $E(\mathbf{t})$ is the set of edges of $\mathbf{t}$ and
$(\eta_e, e \in E(\mathbf{t})) \in (0,\infty)^{E(\mathbf{t})}$ are positive marks, interpreted as the lengths of
the associated edges. The tree $\mathbf{t}$ is called the shape and we let $\Theta_B$ be the set of
trees with edge lengths whose shape is in $\mathbb{T}_B$.

Let $(\pi(t), t \ge 0)$ be a refining process with values in $\mathcal{P}$, starting at $1_{\mathbb{N}}$.
Assume, further, that

$D_i := D_{\{i\}} = \inf\{s \ge 0 : \{i\} \in \pi(s)\} < \infty$  for all $i \ge 1$,

so that in particular, $\pi|_B(t) = 0_B$ for some finite $t$. Recall that a vertex $v$ of
any $\mathbf{t} \in \mathbb{T}_B$ is naturally identified with the set $B_v$ of labels of the leaves that
descend from $v$. We are going to make this identification in the sequel.

For $B \subset \mathbb{N}$, we let $\theta_{\pi|_B} \in \Theta_B$ be the tree with edge lengths whose shape
is $\mathbf{t}_{\pi|_B}$ and whose edge lengths are $\eta_e = D_v - D_{\bar v}$ whenever $e \in E(\mathbf{t}_{\pi|_B})$ is
the edge linking a nonroot vertex $v$ and its parent $\bar v$. Notice that whereas
$D_v = \inf\{t \ge 0 : \pi|_{B_v}(t) \ne 1_{B_v}\}$ for a nonleaf vertex $v$ only depends on $\pi|_{B_v}$,
$D_i$ is defined differently and depends on the whole process $(\pi(t), t \ge 0)$ rather
than its restriction to $\{i\}$ or $B$.

Now, suppose that $(\Pi(t), t \ge 0)$ is a self-similar fragmentation with
dislocation measure $\nu$ and index $\alpha < 0$. By [9], it holds that $0 < D_i < \infty$ a.s.
for every $i$ and, in fact, $\sup_{i \ge 1} D_i < \infty$ in that case. Therefore, $\mathcal{R}_B = \theta_{\Pi|_B}$
is well defined. This tree was called $\mathcal{R}(B)$ in [29], Section 2.3, where it was
constructed slightly differently. We conclude this section by establishing the
link between the two presentations.

If $\theta \in \Theta_B$ has a root-edge $e$ with length $\eta_e$ and if $x < \eta_e$, then we let $\theta - x$
be the element of $\Theta_B$ with the same shape and edge lengths, except for
the root-edge, which is assigned length $\eta_e - x$. If $\theta_1, \dots, \theta_r$ are elements of
$\Theta_{B_1}, \dots, \Theta_{B_r}$ with shapes $\mathbf{t}_1, \dots, \mathbf{t}_r$ for pairwise disjoint nonempty $B_i \subset \mathbb{N}$,
and if $x > 0$, we let

$\langle \theta_1, \dots, \theta_r \rangle_x$

be the element of $\Theta_{\cup_i B_i}$ whose shape is $\langle \mathbf{t}_1, \dots, \mathbf{t}_r \rangle$, whose root-edge length
is $x$ and whose other edge lengths are inherited from those of $\theta_1, \dots, \theta_r$ in
the natural way.

The trees $\mathcal{R}_B$ can be recursively described in the following way. Let $\mathcal{R}_{\{i\}}$
have as shape the only element of $\mathbb{T}_{\{i\}}$ and (single) edge length equal to
$D_{\{i\}}$. Then let

$\mathcal{R}_B = \langle \mathcal{R}_{B_1} - D_B, \dots, \mathcal{R}_{B_r} - D_B \rangle_{D_B}$,

where $B_1, \dots, B_r$ are the nonempty blocks of the partition of $B$ induced by
$\Pi(D_B)$. This is the definition of [29], Section 2.3.
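The recursion can be sketched for a toy refining path on a finite base set. This is an illustration only: the encoding of the path as a list of `(time, blocks)` pairs and the names `split_event`, `tree_with_edge_lengths` and `leaf_heights` are assumptions made here. As in the text, $D_i$ for a leaf is read off the whole path, while the splits of $B$ only involve the restriction to $B$.

```python
def split_event(path, B):
    """Return (D_B, blocks at time D_B) for a refining path [(time, blocks), ...].

    For a singleton, D_B is the first time {i} is a block of the whole
    partition; otherwise it is the first time B is not inside one block.
    """
    B = frozenset(B)
    for time, blocks in path:
        if len(B) == 1:
            if B in blocks:
                return time, blocks
        elif not any(B <= block for block in blocks):
            return time, blocks
    raise ValueError("B is never split along this path")

def tree_with_edge_lengths(path, B, offset=0.0):
    """Return (root_edge_length, labels, subtrees), following the recursion."""
    B = frozenset(B)
    d, blocks = split_event(path, B)
    if len(B) == 1:
        return (d - offset, B, [])
    parts = sorted((B & block for block in blocks if B & block), key=min)
    # each subtree has its root-edge shortened by D_B, as in R_{B_i} - D_B
    return (d - offset, B, [tree_with_edge_lengths(path, p, d) for p in parts])

def leaf_heights(tree, above=0.0):
    """Distance from the root to each leaf, summing edge lengths."""
    length, labels, subtrees = tree
    if not subtrees:
        (leaf,) = labels
        return {leaf: above + length}
    heights = {}
    for sub in subtrees:
        heights.update(leaf_heights(sub, above + length))
    return heights
```

In this sketch the height of leaf $i$ always comes out as $D_i$, which is a quick sanity check on the recursion.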
3.3. Continuum trees and fragmentation processes.

3.3.1. R-trees. We now introduce the continuous version of trees that is
needed to deal with continuum random trees, following [4, 19].

An R-tree $(\tau, d)$ is a complete separable metric space such that for every
$x, y \in \tau$:

1. there is an isometry $\varphi_{x,y} : [0, d(x,y)] \to \tau$ such that $\varphi_{x,y}(0) = x$ and
$\varphi_{x,y}(d(x,y)) = y$;

2. for every injective path $c : [0,1] \to \tau$ with $c(0) = x$, $c(1) = y$, one has
$c([0,1]) = \varphi_{x,y}([0, d(x,y)])$.

In other words, there exists a geodesic in $\tau$ linking any two points and
this geodesic is the only simple path linking these points (up to
reparameterization). We usually denote by $[[x,y]]$ the range of $\varphi_{x,y}$. This is indeed a
continuous analog of the graph-theoretic definition of a tree as a connected
graph with no cycle. The R-trees we will be considering are also rooted, that
is, they have a distinguished element which we denote by $\rho$.

We say that two rooted R-trees $(\tau, \rho, d)$, $(\tau', \rho', d')$ are equivalent if there
exists an isometry from $\tau$ onto $\tau'$ that sends $\rho$ to $\rho'$. We denote by
$\Theta$ the set of equivalence classes of compact rooted R-trees. It has been
shown in [19] that $\Theta$ is a Polish space when endowed with the so-called
pointed Gromov-Hausdorff distance $d_{GH}$ where, by definition, the distance
$d_{GH}((\tau, \rho), (\tau', \rho'))$ is equal to the infimum of the quantities

$\delta(r, r') \vee \delta_H(T, T')$,

where $(T, r)$, $(T', r')$ are isometric embeddings of $(\tau, \rho)$, $(\tau', \rho')$ into a common
metric space $(M, \delta)$ and $\delta_H$ is the Hausdorff distance between compact
subsets of $(M, \delta)$. It is elementary that this distance does not depend on
particular choices in the equivalence classes of $(\tau, \rho)$ and $(\tau', \rho')$. We endow
$\Theta$ with the associated Borel $\sigma$-algebra. In the sequel, by a slight abuse of
notation, we will still call elements of $\Theta$ rooted R-trees, and we will denote
them by $\tau$, omitting mention of the root and the distance $d$. Also, by a
probability measure on an element $\tau \in \Theta$, we will mean an equivalence class
of a 4-tuple $(\tau, \rho, d, \mu)$, where we call $(\tau, \rho, d, \mu)$ and $(\tau', \rho', d', \mu')$ equivalent
if there exists an isometry from $(\tau, \rho, d)$ to $(\tau', \rho', d')$ such that $\mu'$ is the
push-forward of $\mu$.
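For finite point sets already embedded in a common metric space, the quantity $\delta(r, r') \vee \delta_H(T, T')$ can be computed directly, and any such value is an upper bound for $d_{GH}$. The sketch below assumes points of $\ell_1$ encoded as tuples of coordinates; the function names are illustrative only.

```python
def l1(x, y):
    """l1 distance between two points given as equal-length tuples."""
    return sum(abs(a - b) for a, b in zip(x, y))

def hausdorff(A, B, dist=l1):
    """Hausdorff distance between two finite nonempty point sets."""
    a_to_b = max(min(dist(a, b) for b in B) for a in A)
    b_to_a = max(min(dist(a, b) for a in A) for b in B)
    return max(a_to_b, b_to_a)

def pointed_embedding_cost(A, root_a, B, root_b, dist=l1):
    """Root distance joined with the Hausdorff distance for one fixed embedding."""
    return max(dist(root_a, root_b), hausdorff(A, B, dist))
```

The Gromov-Hausdorff distance itself is the infimum of this cost over all embeddings, which the sketch does not attempt to compute.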

If $\tau \in \Theta$ and $x \in \tau$, we call the quantity $d(\rho, x)$ the height of $x$.
If $x, y \in \tau$, we say that $x$ is an ancestor of $y$ whenever $x \in [[\rho, y]]$. We let
$x \wedge y \in \tau$ be the unique element of $\tau$ such that $[[\rho, x]] \cap [[\rho, y]] = [[\rho, x \wedge y]]$
and call it the highest common ancestor of $x$ and $y$ in $\tau$. For $x \in \tau$, we denote
by $\tau_x$ the set of $y \in \tau$ such that $x$ is an ancestor of $y$. The set $\tau_x$, endowed
with the restriction of the distance $d$ and rooted at $x$, is in turn a rooted
R-tree called the fringe subtree of $\tau$ rooted at $x$.




We say that $x \in \tau$, $x \ne \rho$, in a rooted R-tree is a leaf if its removal does
not disconnect $\tau$ and we let $\mathcal{L}(\tau)$ denote the set of leaves of $\tau$. A branch
point is an element of $\tau$ of the form $x \wedge y$, where $x$ is not an ancestor of
$y$, nor vice versa. It is also characterized by the fact that the removal of a
branch point disconnects the R-tree into three or more components (two or
more for the root). We let $\mathcal{B}(\tau)$ denote the set of branch points of $\tau$.

3.3.2. Relation with trees with edge lengths, reduced trees. There is a
natural connection between the trees with edge lengths with shape in $\mathbb{T}^{\circ}_n$
(resp. $\mathbb{T}_B$) of the previous sections and rooted R-trees with $n$ leaves (resp.
$\#B$ leaves labeled by $B$) and where the root is not a branch point. If $\tau$
is a rooted R-tree with $\rho \notin \mathcal{B}(\tau)$ and exactly $n$ leaves labeled $L_1, \dots, L_n$,
then we consider the graph whose vertices are the set $V = \{\rho\} \cup \mathcal{L}(\tau) \cup \mathcal{B}(\tau)$
and such that two vertices $x, y$ are connected by an edge if and only if
$[[x,y]] \cap V = \{x, y\}$. The resulting graph is a tree which is naturally rooted
at $\rho$ and the edge connecting $x$ and $y$ naturally inherits the length
$d(x,y) = |d(\rho, x) - d(\rho, y)|$. This construction can be reversed, associating an R-tree
with a tree with edge lengths, for example, by means of Aldous's sequential
construction [4], page 252.

Also, if $\mathbf{t}$ is an element of $\mathbb{T}^{\circ}_n$ or $\mathbb{T}_B$, one naturally puts edge lengths equal
to 1 on each edge and considers $\mathbf{t}$ as an R-tree as well, the restriction of the
distance of this R-tree to the set of branch points, leaves and the root being
the usual combinatorial distance on the vertices of $\mathbf{t}$.

For $\tau$ a rooted R-tree and $x_1, x_2, \dots, x_n \in \tau$, we let

$R(\tau, x_1, \dots, x_n) = \bigcup_{i=1}^{n} [[\rho, x_i]]$

be the reduced subtree associated with $\tau, x_1, \dots, x_n$. It is elementary that
$R(\tau, x_1, \dots, x_n)$ is, in turn, an R-tree, which is naturally rooted at $\rho$ and
whose leaves are included in $\{x_1, \dots, x_n\}$ (it might be that $x_i$ is not a leaf of
the reduced tree whenever $x_i$ is an ancestor of $x_j$ for some $j \ne i$, but note that
this never happens if $x_1, \dots, x_n$ are distinct leaves of $\tau$). By the discussion
of the previous paragraph, if $\tau$ is such that $\rho \notin \mathcal{B}(\tau)$ and if $x_1, \dots, x_n$ are
leaves of $\tau$, then $R(\tau, x_1, \dots, x_n)$ can also be considered as a tree with edge
lengths, whose shape is in $\mathbb{T}_n$, since the leaves inherit a natural labelling
from that of $x_1, \dots, x_n$.
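For a discrete tree with unit edge lengths, the reduced subtree is just the union of the root-to-vertex paths. A minimal sketch, assuming the tree is given by a hypothetical `parent` dictionary mapping each non-root vertex to its parent:

```python
def reduced_subtree(parent, root, points):
    """Vertex set of the union of the paths [[root, x]] for x in points."""
    kept = {root}
    for x in points:
        # climb towards the root until we hit a vertex that is already kept
        while x not in kept:
            kept.add(x)
            x = parent[x]
    return kept

def reduced_leaves(parent, root, points):
    """Vertices of the reduced subtree (other than the root) with no kept child."""
    kept = reduced_subtree(parent, root, points)
    has_child = {parent[v] for v in kept if v != root}
    return {v for v in kept if v != root and v not in has_child}
```

With unit edges, the total length of the reduced subtree is the number of kept vertices minus one. The second helper illustrates the parenthetical remark: a spanning point that is an ancestor of another one is not a leaf of the reduced tree.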

3.3.3. Continuum random trees and fragmentation trees. The fragmentation
trees introduced in [29] are yet another way to consider self-similar
fragmentation processes whose characteristic pair $(-\gamma, \nu)$ satisfies $\gamma > 0$. In
order to introduce them, we first need to give some definitions and results
on continuum trees, following [4].

We say that a pair $(\tau, \mu)$ is a continuum tree if $\tau \in \Theta$ and $\mu$ is a probability
measure on $\tau$ such that:

1. $\mu$ is supported by the set $\mathcal{L}(\tau)$;

2. $\mu$ has no atom;

3. for every $x \in \tau \setminus \mathcal{L}(\tau)$, $\mu(\tau_x) > 0$.

Notice that continuum trees automatically satisfy a number of properties.
For example, the set $\mathcal{L}(\tau)$ must be uncountable (by 1 and 2) and cannot have
isolated points (by 2 and 3).

A continuum random tree (CRT) is a "random variable" whose values are
continuum trees, defined on some probability space $(\Omega, \mathcal{A}, \mathbb{P})$. To formalize
this, we should endow the set of continuum trees with a $\sigma$-algebra. A natural
possibility would be to use Evans' and Winter's separable and complete
metric structure [20] on the space of "weighted R-trees," although we would
have to incorporate the fact that our trees are rooted. Another, probably
even more natural, approach would be to use the Gromov-weak topology on
the set of metric measure spaces introduced in the recent work of Greven,
Pfaffelhuber and Winter [27].

However, for technical simplicity, in this paper, we prefer to follow Aldous
[4] and use the space $\ell_1 = \ell_1(\mathbb{N})$ as a base space for defining our CRTs.
Namely, we endow the set of compact subsets of $\ell_1$ with the Hausdorff metric,
and the set of probability measures on $\ell_1$ with any metric inducing the
topology of weak convergence, so that the set of pairs $(T, \mu)$, where $T$ is
a rooted R-tree embedded as a subset of $\ell_1$ and $\mu$ is a measure on $T$, is
endowed with the product $\sigma$-algebra. Convergence for the Hausdorff metric
for subsets of $\ell_1$ is stronger than convergence of the associated equivalence
classes for the Gromov-Hausdorff topology. In the sequel, we always keep
in mind that the usual probability "operations" such as conditioning, for
example, with respect to $\mu$, and then sampling i.i.d. random variables with
law $\mu$, are done by using this $\ell_1$-embedded measurable representative before
taking isometric equivalence classes. In this sense, given $(T, \mu)$, we will call
an i.i.d. sequence $L_1, L_2, \dots$ with common law $\mu$ an exchangeable sequence
with directing law $\mu$.

For $a > 0$ and $(\tau, \rho(\tau), d) \in \Theta$, we denote by $a\tau$ the element $(\tau, \rho(\tau), ad)$
obtained by scaling distances by a factor $a$.

For $(\tau, \mu)$ a continuum tree, we let $C_t^1, C_t^2, \dots$ be the connected
components of the open set $\{x \in \tau : d(x, \rho(\tau)) > t\}$, ranked so that
$\mu(C_t^1) \ge \mu(C_t^2) \ge \cdots$. We then let $\sigma_t^i$ be the element of $\tau$ at height $t$ such
that $C_t^i \subset \tau_{\sigma_t^i}$. Then $\tau_t^i = C_t^i \cup \{\sigma_t^i\}$ is a compact R-tree which we root
at $\sigma_t^i$. Notice that $\tau_t^i$ is equal to $\tau_{\sigma_t^i}$ unless $\sigma_t^i \in \mathcal{B}(\tau)$.

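Definition 5. A self-similar tree with index $-\gamma < 0$ is a continuum random
tree $(\mathcal{T}, \mu)$ such that for every $t \ge 0$, given $(\mu(\mathcal{T}_t^i), i \ge 1)$, the continuum
random trees

$\left(\mu(\mathcal{T}_t^1)^{-\gamma} \mathcal{T}_t^1, \frac{\mu(\cdot \cap \mathcal{T}_t^1)}{\mu(\mathcal{T}_t^1)}\right), \left(\mu(\mathcal{T}_t^2)^{-\gamma} \mathcal{T}_t^2, \frac{\mu(\cdot \cap \mathcal{T}_t^2)}{\mu(\mathcal{T}_t^2)}\right), \dots$

are i.i.d. copies of $(\mathcal{T}, \mu)$.

Again, the last sentence means that there exist i.i.d. copies of a
representative of $(\mathcal{T}, \mu)$ embedded in $\ell_1$, whose isometry classes are those of

$\left(\mu(\mathcal{T}_t^1)^{-\gamma} \mathcal{T}_t^1, \frac{\mu(\cdot \cap \mathcal{T}_t^1)}{\mu(\mathcal{T}_t^1)}\right), \dots$.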

As was shown in [29], Theorem 1, Proposition 1, (laws of) self-similar
continuum random trees with index $-\gamma < 0$ are in one-to-one correspondence
with (laws of) self-similar fragmentation processes with index $-\gamma$, no erosion
and no sudden loss of mass. We briefly describe how the two objects are
related.

Proposition 6. Let $(-\gamma, \nu)$ be a characteristic pair with $\gamma > 0$. There
then exists a (unique) self-similar CRT $(\mathcal{T}, \mu)$ with index $-\gamma$ such that the
following holds. Given $(\mathcal{T}, \mu)$, let $L_1, L_2, \dots$ be an exchangeable sequence with
directing law $\mu$. For every $t \ge 0$, we let $\Pi(t)$ be the random element of $\mathcal{P}$ such
that $i$ and $j$ are in the same block of $\Pi(t)$ if and only if $d(\rho(\mathcal{T}), L_i \wedge L_j) > t$,
that is, if and only if $L_i$ and $L_j$ belong to the same element of $\{\mathcal{T}_t^1, \mathcal{T}_t^2, \dots\}$.
Then:

(i) the process $(\Pi(t), t \ge 0)$ is a $\mathcal{P}$-valued self-similar fragmentation with
characteristic pair $(-\gamma, \nu)$ and the process $((\mu(\mathcal{T}_t^i), i \ge 1), t \ge 0)$ is equal to
the process $(|\Pi(t)|^{\downarrow}, t \ge 0)$, that is, it is an $\mathcal{S}^{\downarrow}$-valued fragmentation process
with characteristic pair $(-\gamma, \nu)$;

(ii) if $\mathcal{T}_t^{(i)}$ denotes the element of $\{\mathcal{T}_t^1, \mathcal{T}_t^2, \dots\}$ that contains $L_i$, then the
process $(\mu(\mathcal{T}_t^{(i)}), t \ge 0)$ is equal to $(|\Pi_{(i)}(t)|, t \ge 0)$;

(iii) the reduced tree $R(\mathcal{T}, L_1, \dots, L_k)$ is equal to the tree with edge lengths
$\mathcal{R}_{[k]}$, as defined in Section 3.2.

Proof. (i) It was shown in [29] that there is a unique tree $(\mathcal{T}, \mu)$ such
that $((\mu(\mathcal{T}_t^i), i \ge 1), t \ge 0)$ is the $\mathcal{S}^{\downarrow}$-valued fragmentation process with
characteristic pair $(-\gamma, \nu)$. The fact that $(\mu(\mathcal{T}_t^i), i \ge 1) = |\Pi(t)|^{\downarrow}$ for every $t$ comes
from the fact that $\mu$ is a.s. the limit of the empirical measure on $L_1, L_2, \dots$.
It is easy to show that this process is right-continuous and that $(\Pi(t), t \ge 0)$
is a càdlàg $\mathcal{P}$-valued process, and it follows that $(\Pi(t), t \ge 0)$ is the $\mathcal{P}$-valued
fragmentation process with characteristic pair $(-\gamma, \nu)$. (ii) is immediate from
the fact that $i$ and $j$ are in the same block of $\Pi(t)$ if and only if $L_j$ is in $\mathcal{T}_t^{(i)}$
and the fact that $\mu$ is a.s. the limit of the empirical measure on the leaves
$L_1, \dots, L_n$ as $n \to \infty$. Finally, (iii) is [29], Lemma 4. □
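The rule that defines $\Pi(t)$ from the sampled leaves can be sketched for finitely many leaves. The sketch assumes a hypothetical function `height(i, j)` returning $d(\rho(\mathcal{T}), L_i \wedge L_j)$ for $i \ne j$; in a tree these heights satisfy the ultrametric-type inequality that makes "height $> t$" an equivalence relation, so comparing with one representative per block is enough.

```python
def partition_at(height, n, t):
    """Blocks of {1, ..., n}: i and j share a block iff height(i, j) > t."""
    blocks = []
    for i in range(1, n + 1):
        for block in blocks:
            if height(i, block[0]) > t:  # compare with the block representative
                block.append(i)
                break
        else:
            blocks.append([i])
    return [set(b) for b in blocks]
```

As $t$ grows the blocks can only split, so the resulting family of partitions is refining, as required of a fragmentation.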

4. Asymptotics of discrete fragmentation trees. We now embark on the 
proof of Theorem 2. We can obtain a weaker statement of convergence in 
distribution by using Aldous' theorems [4], Theorem 18, Corollary 19 and 
Remark 4. With a little more effort, we establish the stronger statement of
Theorem 2 that, in fact, there exists a fragmentation tree defined on the
given probability space, to which the discrete fragmentation trees converge 
in probability. 

It will often be convenient to initially assume the following. 

Hypothesis (H). Assume that we are given a probability space
supporting $(\mathcal{T}, \mu)$, a fragmentation tree associated with a self-similar
fragmentation with characteristic pair $(-\gamma_{\nu}, \nu)$, where $\nu$ satisfies the assumptions
(6), (7). For simplicity, we let $\gamma = \gamma_{\nu}$. We assume that our probability space
also supports $L_1, L_2, \dots$, an exchangeable sample of leaves with directing
measure $\mu$. We let $\mathcal{R}_k = R(\mathcal{T}, L_1, \dots, L_k)$ and define a self-similar $\mathcal{P}$-valued
fragmentation process $(\Pi(t), t \ge 0)$ with index $-\gamma$ by the device explained
in Proposition 6. Also, we let $(\Pi^0(t), t \ge 0)$ be the homogeneous fragmentation
process obtained from $(\Pi(t), t \ge 0)$ by the time-change transformation
of Lemma 5. We denote by $\xi(t) = -\log|\Pi^0_{(1)}(t)|$, $t \ge 0$, the pure-jump
subordinator with Lévy measure (18) associated by Lemma 5.

We let $T_n$ be the discrete fragmentation tree with $n$ leaves associated
with $(\Pi(t), t \ge 0)$ [or $(\Pi^0(t), t \ge 0)$], as in Section 2. The tree $T_n$ is then
considered as an R-tree by assuming that its edges are segments with length
1, in accordance with the discussion of Section 3.3.2.

To see that Theorem 2 remains true without Hypothesis (H), simply note
that a strongly consistent sequence $(T_n^{\circ})$ has the same distribution (as a
sequence of random variables) as if it were constructed under Hypothesis
(H). Since convergence in probability for random variables with values in a
complete space can be metrized by a complete distance (see [15], Theorem
9.2.3), we deduce that $(T_n^{\circ})$ is a Cauchy sequence for this distance, and thus
converges in probability, because the set of compact real trees endowed with
the Gromov-Hausdorff distance is complete. The distribution of the limit is
identified as that of the fragmentation tree $\mathcal{T}$.

We recall that all trees involved in Aldous' study [4] are subspaces of
$\ell_1$, endowed with the $\ell_1$-distance, and that convergence of (compact) trees
holds with respect to the Hausdorff distance. Using $\ell_1$-representatives of
the trees $T_n$ and $\mathcal{T}$ (which is always possible; see [29]) and applying
Aldous' asymptotic results will then lead us to the claimed convergence in the
Gromov-Hausdorff sense. More precisely, using Theorem 18, Corollary 19
and Remark 4 in [4], we see that the proof of the convergence in distribution
for the Gromov-Hausdorff topology,

$\frac{T_n}{n^{\gamma} \ell(n)} \xrightarrow{(d)} \Gamma(1-\gamma)\,\mathcal{T}$,

amounts to the following:

(i) the leaf-tightness of $(\mathcal{R}_k, k \ge 1)$, that is, $\min_{2 \le j \le k} d(L_1, L_j) \to 0$ in
probability as $k \to \infty$;

(ii) the (a.s.) compactness of $\mathcal{T}$;

(iii) the convergence of "finite-dimensional marginals";

(iv) a tightness criterion, which is stated precisely in Proposition 9 below.

We will obtain the stronger convergence in probability under Hypothesis 
(H) by using the particular coupling of the discrete and continuum fragmen- 
tation trees, and establishing an almost sure convergence result in (iii). The 
tightness estimate of Aldous then provides the uniform bound that is needed 
to extend convergence of finite-dimensional marginals to Gromov-Hausdorff 
convergence, at the price of turning the a.s. convergence into convergence in 
probability. 

The first two points are proved in [29], Lemmas 3 and 5. The aim of this
section is therefore to prove the latter two: Section 4.1 is devoted to the
convergence of finite-dimensional marginals and Section 4.2 to the tightness
estimate. Section 4.2 also contains the proof of Theorem 2. Finally, Section
4.3 provides an analog of Theorem 2 for convergence of leaf-height functions.

4.1. Convergence of finite-dimensional marginals. The first step is given
by the following proposition, which contains the convergence of
"finite-dimensional marginals" for Theorem 2, but note that we do not need the
integrability condition (7).

Proposition 7. Let $\nu$ be a conservative dislocation measure satisfying
the regular variation condition (6), $(T_n)_{n \ge 1}$ an associated strongly sampling
consistent family of discrete fragmentation trees defined on any probability
space. Then the same probability space also supports $\mathcal{R}_k$ so that

$n^{-\gamma} \ell(n)^{-1} R(T_n, [k]) \to \Gamma(1-\gamma)\,\mathcal{R}_k$  a.s. as $n \to \infty$,

in the Gromov-Hausdorff sense, for all $k \ge 1$.

We observe that the convergence is in the sense of the Gromov-Hausdorff
metric, but in the context of trees with edge lengths, there is a simple
sufficient condition: finite trees with edge lengths $\theta_n$ converge to another finite
tree with edge lengths $\theta$ if the shape of $\theta_n$ is eventually that of $\theta$ and the
edge lengths converge pointwise. This condition is almost necessary, but
there is a complication when some edge lengths converge to zero and shapes
oscillate; this will be irrelevant here.

A key ingredient is provided by the following lemma. 

Lemma 8 ([26]). Let $\xi = (\xi_t, t \ge 0)$ be a pure-jump subordinator with
Lévy measure $\Lambda$ satisfying

(20)  $\Lambda([x, \infty)) = x^{-\gamma} \ell(1/x)$,  $x \downarrow 0$.

Let $V_1, V_2, \dots$ be a sequence of nonnegative random variables which
conditionally given $\xi$ are independent and identically distributed with

$\mathbb{P}(V_i > v) = \exp(-\xi_v)$,  $v \ge 0$,

and for $s \ge 0$, let

$K_n(s) := \#\{V_i : 1 \le i \le n, V_i \le s\}$,

which is the number of distinct values among the $V_i$, $1 \le i \le n$, with $V_i \le s$.
Then

(21)  $\lim_{n \to \infty} \sup_{0 \le s \le \infty} \left| \frac{K_n(s)}{n^{\gamma} \ell(n)} - \Gamma(1-\gamma) \int_0^s \exp(-\gamma \xi_v)\,dv \right| = 0$  a.s.

and hence for every random variable $S$ with values in $[0, \infty]$,

(22)  $\lim_{n \to \infty} \frac{K_n(S)}{n^{\gamma} \ell(n)} = \Gamma(1-\gamma) \int_0^S \exp(-\gamma \xi_v)\,dv$  a.s.

Proof. The joint distribution of the two processes $(K_n(s), s \ge 0)$ and
$(\xi_t, t \ge 0)$ is the same as if $V_1, V_2, \dots$ were more specifically of the form
$V_i = \inf\{v \ge 0 : e^{-\xi_v} < U_i\}$, where $U_1, U_2, \dots$ is a sequence of independent
uniform (0,1) variables independent of $\xi$. Then $K_n(s)$ is, as in [26], the
minimal number of open intervals of the form $(\exp(-\xi_v), \exp(-\xi_{v-}))$, $v \le s$,
containing $U_i$, $1 \le i \le n$ [with $U_i > \exp(-\xi_s)$].

If $\mathbb{P}(S = s) = 1$ for some fixed $s \in [0, \infty]$, then the conclusion (22) is read
from [26], Theorem 4.1, as indicated in [26], Corollary 5.2, in the case $s = \infty$.
The uniform convergence (21) follows by a standard pathwise argument,
using the facts that the process $(K_n(s), s \ge 0)$ is increasing in $s$ for each
$n$ and that the limit process $(\int_0^s \exp(-\gamma \xi_v)\,dv, s \ge 0)$ has continuous paths.
□
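The representation $V_i = \inf\{v \ge 0 : e^{-\xi_v} < U_i\}$ used in the proof is straightforward to evaluate on a given pure-jump path. The following sketch assumes the path is encoded by its jump times and the values of $\xi$ just after each jump (with $\xi = 0$ before the first jump); the function names are illustrative, and values that are never reached on the finite path are simply excluded from the count.

```python
import math

def first_passage(jump_times, xi_values, u):
    """V = inf{v >= 0 : exp(-xi_v) < u} for a right-continuous pure-jump path."""
    for time, value in zip(jump_times, xi_values):
        if math.exp(-value) < u:
            return time  # the path can only cross the level at a jump time
    return math.inf      # the level is not reached on this finite path

def K(jump_times, xi_values, uniforms, s):
    """Number of distinct finite values among the V_i that are at most s."""
    values = {first_passage(jump_times, xi_values, u) for u in uniforms}
    return sum(1 for v in values if v <= s and v != math.inf)
```

Several uniforms falling into the same interval $(\exp(-\xi_v), \exp(-\xi_{v-}))$ give the same value $V_i = v$, which is why $K_n(s)$ counts intervals hit rather than sample points.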

Proof of Proposition 7. Recall that the discrete tree $T_n$ is also
considered as an R-tree by letting its edge lengths all be 1, so we may consider
reduced trees of the form $R(T_n, x_1, \dots, x_k)$ where $x_1, \dots, x_k$ are vertices of
$T_n$. We let $R(T_n, B)$ be the reduced tree of $T_n$ spanned by the root and the
vertices labeled by $B$. By exchangeability of the partition-valued process
$(\Pi(t), t \ge 0)$, it is plain that $R(T_n, \{1\}, \dots, \{k\})$ has the same law as $R(T_n, B)$
for any $B$ with $\#B = k$ and $n \ge \max B$. We are going to show that almost
surely, for every finite $B \subset \mathbb{N}$,

(23)  $n^{-\gamma} \ell(n)^{-1} R(T_n, B) \to \Gamma(1-\gamma)\,R(\mathcal{T}, \{L_i, i \in B\})$  as $n \to \infty$.

Notice that the shape of $R(T_n, B)$ is exactly $T_B$, as in Definition 3, although
the edge lengths are different from 1 in general.
Now, assume Hypothesis (H). Consider the case $k = 1$. By exchangeability,
it is enough to discuss $B = \{1\}$. Let $D_1^n$ be the (combinatorial) distance
between the root of $T_n$ and $\{1\}$. By construction of $T_n$, we see that $D_1^n - 1$
is the number of fragmentations that the block of $(\Pi^0|_{[n]}(t), t \ge 0)$ containing
1 undergoes from $[n]$ to $\{1\}$. Similarly,

(24)  $D_1^n - 1 = \#\{L_1 \wedge L_i, 2 \le i \le n\} = \#\{d(\rho(\mathcal{T}), L_1 \wedge L_i), 2 \le i \le n\}$,

which is the number of branch points of $R(\mathcal{T}, L_1, \dots, L_n)$ located on
$[[\rho(\mathcal{T}), L_1]]$. Conditionally given $(\mathcal{T}, \mu)$ and $L_1$, the random variables
$d(\rho(\mathcal{T}), L_1 \wedge L_i)$, $i \ge 2$, are independent and identically distributed with

$\mathbb{P}(d(\rho(\mathcal{T}), L_1 \wedge L_i) > t \mid \mathcal{T}, \mu, L_1) = \mu(\mathcal{T}_t^{(1)}) = |\Pi_{(1)}(t)| = |\Pi^0_{(1)}(\eta_{(1)}(t))|$,

where, according to (ii) in Proposition 6 and Lemma 5, the process $\eta_{(1)}$ is
the inverse of the process

$\eta_{(1)}^{-1}(t) = \int_0^t |\Pi^0_{(1)}(u)|^{\gamma}\,du$.

Moreover, $\xi := (-\log|\Pi^0_{(1)}(t)|, t \ge 0)$ is a pure-jump subordinator with Lévy
measure defined in (18). Since the time-change $\eta_{(1)}$ is continuous and strictly
increasing, we also see that

(25)  $D_1^n - 1 = \#\{V_i, 2 \le i \le n\}$,  where $V_i = \eta_{(1)}(d(\rho(\mathcal{T}), L_1 \wedge L_i))$,

so that, conditionally given $(\mathcal{T}, \mu)$ and $L_1$, the $V_i$ for $i \ge 2$ are independent
and identically distributed with

$\mathbb{P}(V_i > v \mid \mathcal{T}, \mu, L_1) = \exp(-\xi_v)$,  $v \ge 0$.

The desired conclusion that

$n^{-\gamma} \ell(n)^{-1} D_1^n \to \Gamma(1-\gamma) \int_0^{\infty} \exp(-\gamma \xi_s)\,ds = \Gamma(1-\gamma)\,\eta_{(1)}^{-1}(\infty) = \Gamma(1-\gamma)\,D_1$

is now read from (22) with $S = \infty$.

Next, assume that (23) holds for every $B$ with $\#B = k$. We show that
it then holds for $\#B = k + 1$. Again by exchangeability, it is enough to
discuss the case $B = [k+1]$. Let $D_{[k+1]}$ be the first time $t$ when $[k+1]$ is
not included in a block of $\Pi(t)$ so that $D_{[k+1]}$ is, by definition, the length
of the edge adjacent to the root in $R(\mathcal{T}, L_1, \dots, L_{k+1})$, that is, the height
of $L_1 \wedge L_2 \wedge \cdots \wedge L_{k+1}$. Similarly, we let $D^0_{[k+1]}$ be the analogous time, but
for the process $(\Pi^0(t), t \ge 0)$. By the time-change correspondence between
$\Pi$ and $\Pi^0$ (Lemma 5), if $\xi_t = -\log|\Pi^0_{(1)}(t)|$, $t \ge 0$, we know that

$D_{[k+1]} = \int_0^{D^0_{[k+1]}} \exp(-\gamma \xi_s)\,ds$.

Let $D^n_{[k+1]}$ be the height of the first branch point in $R(T_n, \{1\}, \dots, \{k+1\})$.
Then $D^n_{[k+1]} - 1$ is the number of fragmentation events undergone by
$\Pi^0|_{[n]}(t)$, $0 \le t < D^0_{[k+1]}$. This is also the number of distinct branch points of
$R(\mathcal{T}, L_1, \dots, L_n)$ belonging to $R(\mathcal{T}, L_1, \dots, L_{k+1})$ with height $< D_{[k+1]}$, that
is,

$D^n_{[k+1]} - 1 = \#\{L_1 \wedge L_i, k+2 \le i \le n, d(\rho(\mathcal{T}), L_1 \wedge L_i) < D_{[k+1]}\}$

(one could use any $L_j$, $j \le k+1$, instead of $L_1$ in this formula). By the same
argument used for $k = 1$,

$D^n_{[k+1]} - 1 = \#\{V_i, k+2 \le i \le n, V_i < D^0_{[k+1]}\}$,  where $V_i = \eta_{(1)}(d(\rho(\mathcal{T}), L_1 \wedge L_i))$,

as before. Formula (22) applied with $S = D^0_{[k+1]}$ now yields

(26)  $n^{-\gamma} \ell(n)^{-1} D^n_{[k+1]} \to \Gamma(1-\gamma) \int_0^{D^0_{[k+1]}} e^{-\gamma \xi_s}\,ds = \Gamma(1-\gamma)\,D_{[k+1]}$  as $n \to \infty$,

so the renormalized length of the root-edge of $R(T_n, \{1\}, \dots, \{k+1\})$
converges to the length of the root-edge of $R(\mathcal{T}, L_1, \dots, L_{k+1})$, up to the
renormalization factor $\Gamma(1-\gamma)$.

Next, let $\pi = \Pi^0|_{[k+1]}(D^0_{[k+1]})$, with nonempty blocks $\pi_1, \dots, \pi_r$. Recalling
the notation of Sections 3.2 and 3.3.2, we have

$R(T_n, [k+1]) = \langle R(T_n, \pi_1) - D^n_{[k+1]}, \dots, R(T_n, \pi_r) - D^n_{[k+1]} \rangle_{D^n_{[k+1]}}$

because $D^n_{[k+1]}$ is the height of the first branch point of $R(T_n, [k+1])$, while
$\pi_i \subset [k+1]$. For the same reason,

$R(\mathcal{T}, \{L_1, \dots, L_{k+1}\}) = \langle R(\mathcal{T}, \{L_i, i \in \pi_1\}) - D_{[k+1]}, \dots, R(\mathcal{T}, \{L_i, i \in \pi_r\}) - D_{[k+1]} \rangle_{D_{[k+1]}}$.

Now, condition on the first split $\pi$. The conclusion follows from (26) and the
induction hypothesis, which implies that for $1 \le i \le r$,

$n^{-\gamma} \ell(n)^{-1} R(T_n, \pi_i) \to \Gamma(1-\gamma)\,R(\mathcal{T}, \{L_j, j \in \pi_i\})$  as $n \to \infty$.

This completes the proof under Hypothesis (H). Note, however, that the
joint distribution of $(T_n)_{n \ge 1}$ as a sequence of $\Theta$-valued random variables is
the same under Hypothesis (H) as in the apparently more general setting of
Proposition 7. Since $\Theta$ is complete, we conclude that also in the setting of
Proposition 7, there exists a tree $\mathcal{R}_k$ on the given probability space to which
the rescaled $R(T_n, [k])$ converge a.s. □

4.2. Tightness estimate. The aim of this subsection is to prove the
forthcoming tightness estimate (Proposition 9).

Proposition 9. For $k \le n$, let

$\Delta(n, k) := \max_{1 \le i \le n} d_n(\{i\}, R(T_n, \{1\}, \dots, \{k\}))$,

$d_n$ being the metric associated with the tree $T_n$. Then, under the hypotheses
of Theorem 2, for each $\eta > 0$,

$\lim_{k \to \infty} \limsup_{n \to \infty} \mathbb{P}\left( \frac{\Delta(n,k)}{\bar\Lambda(n^{-1})} > \eta \right) = 0$.
Before we give the proof of this proposition, let us deduce Theorem 2. 

Proof of Theorem 2. First, assume Hypothesis (H). Fix $\varepsilon, \eta > 0$ and
choose $k$ large enough that $\mathbb{P}(\Gamma(1-\gamma)\,d_{GH}(\mathcal{R}_k, \mathcal{T}) > \eta) < \varepsilon$ (we know from
[29] that $\ell_1$-representatives of $\mathcal{R}_k$ converge to $\mathcal{T}$ a.s. as $k \to \infty$; Hausdorff
convergence in $\ell_1$ implies Gromov-Hausdorff convergence) and

$\limsup_{n \to \infty} \mathbb{P}(d_{GH}(R(T_n, \{1\}, \dots, \{k\}), T_n) > \bar\Lambda(n^{-1})\,\eta) < \varepsilon$

(such $k$ exists by Proposition 9). Then for $n$ sufficiently large,

$\mathbb{P}(d_{GH}(R(T_n, \{1\}, \dots, \{k\}), T_n) > \bar\Lambda(n^{-1})\,\eta) < \varepsilon$

and also

$\mathbb{P}(d_{GH}(R(T_n, \{1\}, \dots, \{k\})/\bar\Lambda(n^{-1}), \Gamma(1-\gamma)\,\mathcal{R}_k) > \eta) < \varepsilon$

since $R(T_n, \{1\}, \dots, \{k\})/\bar\Lambda(n^{-1})$ converges a.s. to $\Gamma(1-\gamma)\,\mathcal{R}_k$ as $n \to \infty$
(see Proposition 7). Hence, for $n$ sufficiently large,
$\mathbb{P}(d_{GH}(T_n/\bar\Lambda(n^{-1}), \Gamma(1-\gamma)\,\mathcal{T}) > 3\eta) < 3\varepsilon$. This completes the proof for the
setting of this section, where $T_n$, $n \ge 1$, are derived from an exchangeable sample
of leaves $L_1, L_2, \dots$ with directing measure $\mu$ of a given CRT $(\mathcal{T}, \mu)$. If we do
not assume (H), then we argue, as at the end of the proof of Proposition 7, that
for any probability space supporting $(T_n, n \ge 1)$, there exists a random R-tree
$\mathcal{T}_{(\gamma,\nu)}$ on the same probability space, to which the rescaled $T_n$ converge in
probability. □
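The final step of the proof under Hypothesis (H) rests on the triangle inequality for the Gromov-Hausdorff distance together with the scaling rule $d_{GH}(a\tau, a\tau') = a\,d_{GH}(\tau, \tau')$. Spelled out, with the shorthand $R_n^k := R(T_n, \{1\}, \dots, \{k\})$ introduced here only for brevity:

```latex
\begin{align*}
d_{GH}\!\left(\frac{T_n}{\bar\Lambda(n^{-1})},\, \Gamma(1-\gamma)\,\mathcal{T}\right)
 &\le \frac{d_{GH}(T_n, R_n^k)}{\bar\Lambda(n^{-1})}
  + d_{GH}\!\left(\frac{R_n^k}{\bar\Lambda(n^{-1})},\, \Gamma(1-\gamma)\,\mathcal{R}_k\right)
  + \Gamma(1-\gamma)\, d_{GH}(\mathcal{R}_k, \mathcal{T}).
\end{align*}
```

If the left-hand side exceeds $3\eta$, then at least one of the three terms on the right exceeds $\eta$, and each of these events has probability less than $\varepsilon$ for $n$ large, which gives the bound $3\varepsilon$.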

The proof of Proposition 9 which we postponed is given in Section 4.2.2,
Section 4.2.1 being devoted to the proof of key intermediate results (Lemma
10 and its Corollary 11). We will work under Hypothesis (H), without loss
of generality.

4.2.1. A key lemma. Throughout, we consider a fixed $\nu$. Implicitly, the
constants appearing in this section may depend on $\nu$. Note that the
conditions (6) and (7) satisfied by $\nu$ imply that the tail $\bar\Lambda$ of the Lévy measure $\Lambda$
[defined in (18)], that is, $\bar\Lambda(x) = \int_x^{\infty} \Lambda(dy)$, $x > 0$, satisfies both the regular
variation condition $\bar\Lambda(x) \sim x^{-\gamma} \ell(1/x)$ as $x \to 0$ and $\int^{\infty} x^{\varrho} \Lambda(dx) < \infty$. We
may, and will, also assume, without loss of generality, that $0 < \varrho < \gamma$. We
claim that this implies the existence of some finite constant $C_{\Lambda} > 0$ such
that

(27)  $\bar\Lambda(xy) \le C_{\Lambda}\,\bar\Lambda(x)\,y^{-\varrho}$  for all $y \ge 1$, $0 < x \le 1$.

To see this, choose $\delta < \gamma - \varrho$ and note that Potter's theorem ([11], Theorem
1.5.6) implies the existence of some $X > 0$ such that for $x \in (0, 1]$, $y \ge 1$ with
$xy \le X$, we have

$\bar\Lambda(xy)/\bar\Lambda(x) \le 2 y^{-\gamma+\delta} \le 2 y^{-\varrho}$.

On the other hand, if $x \in (0, 1]$ and $xy > X$, we have

$\bar\Lambda(xy) \le (xy)^{-\varrho} \int_X^{\infty} z^{\varrho} \Lambda(dz)$,

where the last integral is finite, while $x^{-\varrho} \le C'_{\Lambda}\,\bar\Lambda(x)$ for some constant $C'_{\Lambda} > 0$
because $x^{-\varrho}/\bar\Lambda(x)$ is regularly varying with exponent $\gamma - \varrho > 0$ at 0 and
is hence bounded on $(0, 1]$. The estimate (27) will be useful in the sequel.

Let $H_n$ be the height of the tree $T_n$, that is, $H_n := \max_{1 \le i \le n} D_i^n$, where
$D_i^n$ denotes the height of the leaf $\{i\}$ (i.e., its distance to the root) in the
tree $T_n$.

Lemma 10. There exists a random variable $X_{\infty}$, with positive moments
of all orders, such that, for all $p > 2/\gamma$, there exists a constant $C_p$ such that,
for all $x \ge 1$ and all integers $n$,

F{Hn > (1 + x)2XooA(n-i)) < ^. 

Corollary 11. For all $a > 0$ and $p > 2/\gamma$, there exists some constant
$C_{p,a}$ such that, for all $x \ge 1$ and all integers $n$,

FiHn>axA{n~'))<^. 

Proof. We simply use the fact that

$\mathbb{P}(H_n > a x \bar\Lambda(n^{-1})) \le \mathbb{P}(H_n > a x \bar\Lambda(n^{-1}),\ a x > (1 + \sqrt{x})\,2 X_{\infty}) + \mathbb{P}((1 + \sqrt{x})\,2 X_{\infty} \ge a x)$,

then bound the right-hand side from above using the upper bound of
Lemma 10 for the first probability and the fact that $\mathbb{E}[X_{\infty}^p]$ is finite for the
second probability. □

The main idea needed to prove Lemma 10 is to transfer the problem on
the tail of $H_n$ onto a problem on the tail of $D_1^n$, using $H_n = \max_{1 \le i \le n} D_i^n$.
Indeed, for all (random) sequences $(X_i)_{i \ge 1}$ such that the random variables
$(D_i^n, X_i)$, $1 \le i \le n$, are identically distributed, one has

$\mathbb{P}(H_n > X_{\infty} x) \le n\,\mathbb{P}(D_1^n > X_1 x)$  for all $x > 0$,

where $X_{\infty} := \sup_{i \ge 1} X_i$. Therefore, it is sufficient to find random variables
$X_i$, $i \ge 1$, whose supremum possesses moments of all positive orders and
then a suitable upper bound for the tail of $D_1^n$ to conclude. This is the goal
of the remainder of this subsection. Define $X_i$ by

oo 

X, := (1 + A^)Ca Y, exp{-pa) + 1, 
fc=0 

where 



since 7 G (0,1), and ^* is the subordinator describing the evolution of the 
sizes of the blocks containing i in the fragmentation 11^, as explained in 
Lemma 5. Clearly, {D^,Xi), 1 <i<n, are identically distributed (by ex- 
changeability) and 



where 



Xoo = supXi < (1 + A^)Ca{1 + Cp) + 1, 



Cp:= sup/ exp{-pC^)dt 

i>l Jo 



is (in distribution) the first time at which a self-similar fragmentation with
parameters $(-\varrho, \nu)$ reaches the trivial partition $\{\{1\}, \{2\}, \dots\}$ (in other
words, it is the height of the associated fragmentation tree). It was proven in
[28] (Proposition 14) that $\zeta_{\varrho}$ (hence $X_{\infty}$) has exponential moments. Lemma
10 is therefore an immediate consequence of the following result.

Lemma 12. For all p > 0, there exists a constant C' such that for all 
x>l and all integers n, 

P(D^ > (1 + x)2Xi'K{n-^)) < — ^^. 
The remainder of this subsection is devoted to the proof of this lemma.
To simplify the notation, we omit the index 1 wherever we can (i.e., $\xi$ now
stands for $\xi^1$, $X$ for $X_1$). We also set $D_n := D_1^{n+1} - 1$ for the number of
internal vertices between the root and leaf $\{1\}$ in $T_{n+1}$. Since $D_1^{n+1} \le 2 D_n$
and $\bar\Lambda(n^{-1}) \le \bar\Lambda((n+1)^{-1})$, the upper bound stated in Lemma 12 is a
consequence of the existence of some constant $C'$ such that for all $x \ge 1$ and
all integers $n$,

(28) F{D., >(1 + x)XA(n-i)) < -^^. 

To prove this latter inequality, we proceed in three steps. 

Let $N_x(s,t)$ denote the number of jumps of $\xi$ of size at least $x$ in the time interval $[s,t]$, $\widetilde{N}_x(s,t)$ denote the number of jumps of $1 - \exp(-\xi)$ of size at least $x$ in the same time interval and $\widetilde{N}_x := \widetilde{N}_x(0,\infty)$.

Step 1. Large deviations for $\widetilde{N}_x$. The regular variation of $\overline{\Lambda}$ at 0 ensures that $\widetilde{N}_x \sim \overline{\Lambda}(x) D$ a.s. as $x \to 0$, where $D = \int_0^\infty \exp(-\gamma \xi_t)\, dt$ (Theorem 5.1, [26]). The goal of this first step is to give some kind of large deviations result on this convergence.

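Lemma 13. For all $x > 0$ and $0 < y \le 1$,

\[ \mathbb{P}\Big(\widetilde{N}_y \ge (1 + x)\, C_\Lambda \sum_{i=0}^{\infty} \exp(-\rho \xi_i)\, \overline{\Lambda}(y)\Big) \le \exp(-a_x \overline{\Lambda}(y)), \]

where $a_x := (1 + x) \ln(1 + x) - x > 0$.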

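Proof. Let $\mathcal{F}_t$ denote the $\sigma$-field generated by $\xi$ until time $t$ and $\mathcal{F}$ the one generated by $\xi$, and observe that

\[ \widetilde{N}_y = \sum_{i=0}^{\infty} \widetilde{N}_y(i, i+1) \le \sum_{i=0}^{\infty} N_{y \exp(\xi_i)}(i, i+1). \]

Conditional on $\mathcal{F}_i$, $N_{y\exp(\xi_i)}(i, i+1)$ is a Poisson random variable with mean $\overline{\Lambda}(y \exp(\xi_i))$. But for any Poisson random variable $P$ with mean $\lambda$, one has

\[ \mathbb{E}[\exp(tP - (1 + x) t \lambda)] = \exp((\exp(t) - 1 - (1 + x) t) \lambda) \qquad \forall t \in \mathbb{R}. \]

In particular, when $t = \ln(1 + x)$, $\exp(t) - 1 - (1 + x)t = -a_x < 0$ and the expectation is smaller than 1. Hence, for all $n \in \mathbb{N}$, using (27) for the first inequality, we get, for all $y \le 1$,

\[
\begin{aligned}
&\mathbb{P}\Big(\sum_{i=0}^{n} N_{y\exp(\xi_i)}(i, i+1) \ge (1 + x)\, C_\Lambda \sum_{i=0}^{n} \exp(-\rho \xi_i)\, \overline{\Lambda}(y)\Big) \\
&\qquad \le \mathbb{P}\Big(\sum_{i=0}^{n} N_{y\exp(\xi_i)}(i, i+1) \ge (1 + x) \sum_{i=0}^{n} \overline{\Lambda}(y \exp(\xi_i))\Big) \\
&\qquad \le \mathbb{E}\Big[\exp\Big(t \Big(\sum_{i=0}^{n-1} \cdots\Big)\Big)\, \mathbb{E}\big[\exp\big(t\big(N_{y\exp(\xi_n)}(n, n+1) - (1 + x)\overline{\Lambda}(y\exp(\xi_n))\big)\big) \,\big|\, \mathcal{F}_n\big]\Big] \\
&\qquad \le \cdots \le \exp(-a_x \overline{\Lambda}(y)),
\end{aligned}
\]

the last line being obtained by induction: at each step but the last, we use the upper bound 1 for the (conditional) expectation and for the last step, we use the upper bound $\exp(-a_x \overline{\Lambda}(y))$ for the expectation $\mathbb{E}[\exp(t(N_y(0,1) - (1 + x)\overline{\Lambda}(y)))]$. It remains to let $n \to \infty$ in the first probability involved in the above sequence of inequalities and to use Fatou's lemma. □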
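
The Poisson estimate at the heart of this proof can be checked numerically. The following is a minimal sketch, not part of the original argument: it compares the exact upper tail of a single Poisson variable with the bound $\exp(-a_x \lambda)$ obtained from the exponential moment identity at $t = \ln(1+x)$. The function names are illustrative assumptions.

```python
import math


def a(x: float) -> float:
    """Rate a_x = (1 + x) ln(1 + x) - x appearing in Lemma 13."""
    return (1.0 + x) * math.log1p(x) - x


def poisson_upper_tail(lam: float, threshold: float) -> float:
    """P(P >= threshold) for P ~ Poisson(lam), via the complementary finite sum."""
    k_min = max(0, math.ceil(threshold))
    pmf = math.exp(-lam)   # P(P = 0)
    cdf_below = 0.0        # accumulates P(P < k_min)
    for k in range(k_min):
        cdf_below += pmf
        pmf *= lam / (k + 1)
    return max(0.0, 1.0 - cdf_below)


def chernoff_bound(lam: float, x: float) -> float:
    """Upper bound exp(-a_x * lam) for P(P >= (1 + x) * lam)."""
    return math.exp(-a(x) * lam)
```

For instance, `poisson_upper_tail(2.0, 8.0)` can be compared with `chernoff_bound(2.0, 3.0)`; the bound is not sharp, but it has the exponential form in the mean that the induction above requires.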

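Step 2. Large deviations for $\mathbb{E}[D_n|\mathcal{F}]$. We now establish a result similar to the required inequality (28), but for the quantity $\mathbb{E}[D_n|\mathcal{F}]$, where $\mathcal{F} = \mathcal{F}_\infty$ is the $\sigma$-field generated by the whole subordinator $\xi$ [recall that we work under Hypothesis (H)].

Lemma 14. Let $B_\gamma := \sum_{k=1}^{\infty} \exp(-4^{-1} a_1 k^{\gamma/2})$ with $a_1 = 2\ln 2 - 1$. Then for all $x \ge 1$ and all integers $n$ large enough,

\[ \mathbb{P}\big(\mathbb{E}[D_n|\mathcal{F}] \ge (1 + x)(X - 1)\overline{\Lambda}(n^{-1})\big) \le (1 + B_\gamma) \exp(-4^{-1} a_1 x \overline{\Lambda}(n^{-1})). \]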
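
Proof. According to the formula (4) of [26],

\[ \mathbb{E}[D_n|\mathcal{F}] = n \int_0^1 (1 - y)^{n-1} \widetilde{N}_y\, dy \le \widetilde{N}_{1/n} + n \int_0^{1/n} \widetilde{N}_y\, dy. \]

Hence, setting $S := C_\Lambda \sum_{i=0}^{\infty} \exp(-\rho \xi_i)$,

\[
\begin{aligned}
\mathbb{P}\big(\mathbb{E}[D_n|\mathcal{F}] \ge (1 + x)(1 + A_\gamma) S \overline{\Lambda}(n^{-1})\big)
&\le \mathbb{P}\big(\widetilde{N}_{1/n} \ge (1 + x) S \overline{\Lambda}(n^{-1})\big) \\
&\quad + \mathbb{P}\Big(n \int_0^{1/n} \widetilde{N}_y\, dy \ge (1 + x) A_\gamma S \overline{\Lambda}(n^{-1})\Big).
\end{aligned}
\]

The first probability in the right-hand side is smaller than $\exp(-a_x \overline{\Lambda}(n^{-1}))$,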
according to Lemma 13. To bound the second probability, we use n /^ /(fc+i)^ Ny dy < 
Ni/(n[k+l))kg^y which gives 

nl/n 



n 



Nydy> A^{l + x)SA{n~ 
oo

< J2 nNi/nik+1) > 2(fc + 1)^(1 + x)SA(n"i)). 

k=l 

Since A is regularly varying at with index —7, we have, provided n is large 
enough, that A{n-^){k + l)^/^ < 2A(((fc + l)n)-^) and A(((A; + l)n)-^) < 
2A{n~^){k + l)v^ for all k> 1 (to see this, use, e.g.. Potter's theorem. The- 
orem 1.5.6, [11]). Combined with Lemma 13, this implies that the above 
sum of probabilities is smaller than 

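\[ \sum_{k=1}^{\infty} \exp\big(-a_x \overline{\Lambda}(((k+1)n)^{-1})\big) \le \sum_{k=1}^{\infty} \exp\big(-2^{-1} a_x \overline{\Lambda}(n^{-1}) (k+1)^{\gamma/2}\big). \]

Last, the exponential in the latter sum can be split in two, using $(k+1)^{\gamma/2} \ge 2^{-1}(k^{\gamma/2} + 1)$, to get the upper bound

\[ \exp(-4^{-1} a_x \overline{\Lambda}(n^{-1})) \sum_{k=1}^{\infty} \exp\big(-a_x 4^{-1} \overline{\Lambda}(n^{-1}) k^{\gamma/2}\big), \]

which is smaller than $\exp(-4^{-1} a_1 x \overline{\Lambda}(n^{-1})) B_\gamma$ for all $x \ge 1$ ($a_x \ge a_1 x$ for $x \ge 1$) and $n$ large enough. □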

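Step 3. Proof of inequality (28). To start with, fix $x \ge 1$, $n \in \mathbb{N}$, and note that

\[
\begin{aligned}
\mathbb{P}\big(D_n \ge (1 + x) X \overline{\Lambda}(n^{-1})\big) &\le \mathbb{P}\big(\mathbb{E}[D_n|\mathcal{F}] \ge (1 + x)(X - 1)\overline{\Lambda}(n^{-1})\big) \\
&\quad + \mathbb{P}\big(D_n - \mathbb{E}[D_n|\mathcal{F}] \ge (1 + x)\overline{\Lambda}(n^{-1})\big).
\end{aligned}
\tag{29}
\]

Lemma 14 gives an upper bound for the first probability, provided $n$ is large enough. To get an upper bound for the second probability, we use a result on urn models (Devroye [12], Section 6) which ensures that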

P(«„ - ElD.m > .l^) < exp(-^j;jj;-i|^) V, > 0, „ € N, 

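This implies that for all $m \ge 1$, there exists some deterministic constant $B_m$ depending only on $m$ such that

\[
\begin{aligned}
\mathbb{P}\big(D_n - \mathbb{E}[D_n|\mathcal{F}] \ge (1 + x)\overline{\Lambda}(n^{-1}) \,\big|\, \mathcal{F}\big)
&\le B_m \left(\frac{\mathbb{E}[D_n|\mathcal{F}] + (1 + x)\overline{\Lambda}(n^{-1})}{((1 + x)\overline{\Lambda}(n^{-1}))^2}\right)^{m} \\
&\le 2^{m-1} B_m\, \frac{\mathbb{E}[D_n^m|\mathcal{F}] + ((1 + x)\overline{\Lambda}(n^{-1}))^m}{((1 + x)\overline{\Lambda}(n^{-1}))^{2m}},
\end{aligned}
\]

the last line being obtained by Jensen's inequality. We then take expectations on both sides of the resulting inequality. Theorem 6.3 of [26] ensures that $\mathbb{E}[D_n^m] \sim (\overline{\Lambda}(n^{-1}))^m$ (up to a constant). Therefore, we have

\[ \mathbb{P}\big(D_n - \mathbb{E}[D_n|\mathcal{F}] \ge (1 + x)\overline{\Lambda}(n^{-1})\big) \le B_{m,\Lambda} \big((1 + x)\overline{\Lambda}(n^{-1})\big)^{-m}, \tag{30} \]

where $B_{m,\Lambda}$ depends only on $m$ and $\Lambda$.

Next, recall the upper bound given by Lemma 14 for the first probability involved in the right-hand side of (29). Together with the upper bound (30), it leads to the existence of $B'_{m,\Lambda}$ such that

\[ \mathbb{P}\big(D_n \ge (1 + x) X \overline{\Lambda}(n^{-1})\big) \le B'_{m,\Lambda}\, x^{-m} (\overline{\Lambda}(n^{-1}))^{-m} \]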

for all $x \ge 1$ and $n$ large enough, say $n \ge n_0$. Since $\overline{\Lambda}(n^{-1}) \sim n^{\gamma} \ell(n)$ when $n \to \infty$, this upper bound is, in turn, bounded from above by $x^{-m} n^{-\gamma m/2}$, up to some constant, which is the required result (28).

Finally, inequality (28) is also true when $n < n_0$ (for all $x \ge 1$) since $D_n \le n < n_0$ and $X \ge 1$, and therefore the probability $\mathbb{P}(D_n \ge (1 + x) X \overline{\Lambda}(n^{-1}))$ is null whenever $1 + x > n_0 (\overline{\Lambda}(1))^{-1}$.

4.2.2. Proof of Proposition 9. The crucial point is that

\[ \Delta(n, k) = \max_{j \ge 1} H^{(j)}_{n_j^{k,n}}, \]

where the $n_j^{k,n}$ and $H^{(j)}_{n_j^{k,n}}$, $1 \le k \le n$, $j \ge 1$, are defined as follows. Let $\Pi_{(i)}(t)$ denote the block of $\Pi(t)$ containing $i$, $i \ge 1$. Then for all $k \ge 1$, introduce

\[ t_i^k := \inf\{t \ge 0 : \Pi_{(i)}(t) \cap [k] = \emptyset\}, \]

the first time at which the fragment containing $i$ is disjoint from $[k]$ (in particular, $t_i^k = \infty$ for $1 \le i \le k$). For all $t \ge 0$, the collection of blocks $(\Pi_{(i)}(t_i^k + t), i \ge k + 1)$ induces a partition, denoted $\Pi(t^k + t)$, of $\mathbb{N} \setminus [k]$ and each $\Pi_j(t^k + t)$ admits asymptotic frequencies, as $\Pi(t^k + t)$ is an exchangeable partition of $\mathbb{N} \setminus [k]$. We call $n_j^{k,n}$ the cardinality of $\Pi_j(t^k) \cap [n]$ and $\lambda_j^k$ the a.s. limit of $n_j^{k,n}/n$ as $n \to \infty$. Clearly, $\lambda^k_{\max} := \max_{j \ge 1} \lambda_j^k \to 0$ a.s. as $k \to \infty$.

Then let $\mathcal{G}(k)$ be the $\sigma$-field generated by $\Pi(t^k)$. In the terminology of Bertoin ([10], Definition 3.4), the sequence $(t_i^k, i \in \mathbb{N})$ is a stopping line and, as such, satisfies the extended branching property ([10], Lemma 3.14) which ensures that given $\mathcal{G}(k)$, the process $(\Pi(t^k + t), t \ge 0)$ is a fragmentation process starting from $\Pi(t^k)$. This implies that given $\mathcal{G}(k)$, the discrete fragmentation trees, with respectively $n_1^{k,n}, n_2^{k,n}, \ldots$ leaves, associated with the fragmentations of the blocks $\Pi_j(t^k)$, $j \ge 1$, evolve independently as $n \to \infty$ with laws respectively distributed as $T_{n_j^{k,n}}$, $j \ge 1$. In particular, given $\mathcal{G}(k)$, the respective heights of those trees, $H^{(j)}_{n_j^{k,n}}$, $j \ge 1$, are independent and distributed as $H_{n_j^{k,n}}$, $j \ge 1$.

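Let $\eta > 0$. We now turn back to our goal, which is to prove that

\[ \lim_{k \to \infty} \liminf_{n \to \infty} \mathbb{P}\big(\Delta(n, k) \le \eta \overline{\Lambda}(n^{-1})\big) = 1. \]

Note that first applying dominated convergence (for the limit in $k$, everything is bounded by 1) and then Fatou's lemma (for the liminf in $n$), it is sufficient to show that

\[ \lim_{k \to \infty} \liminf_{n \to \infty} \mathbb{P}\big(\Delta(n, k) \le \eta \overline{\Lambda}(n^{-1}) \,\big|\, \mathcal{G}(k)\big) = 1 \quad \text{a.s.} \]

According to the discussion above,

\[ \mathbb{P}\big(\Delta(n, k) \le \eta \overline{\Lambda}(n^{-1}) \,\big|\, \mathcal{G}(k)\big) = \prod_{j \ge 1} \mathbb{P}\big(H_{n_j^{k,n}} \le \eta \overline{\Lambda}(n^{-1}) \,\big|\, \mathcal{G}(k)\big) \]

and our goal turns into the proof of

\[ \lim_{k \to \infty} \liminf_{n \to \infty} \sum_{j \ge 1} \ln\Big(1 - \mathbb{P}\big(H_{n_j^{k,n}} > \eta \overline{\Lambda}(n^{-1}) \,\big|\, \mathcal{G}(k)\big)\Big) = 0. \]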

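For the rest of the argument, we may consider that $n_j^{k,n}$, $\lambda_j^k$, $j \ge 1$, are deterministic and drop the conditioning on $\mathcal{G}(k)$ from the notation. Let $p > \max(\rho^{-1}, 2/\gamma)$. By inequality (27), for all $j, k, n \ge 1$ such that $n_j^{k,n} \ne 0$,

\[ C_\Lambda \overline{\Lambda}(n^{-1}) \ge \left(\frac{n}{n_j^{k,n}}\right)^{\rho} \overline{\Lambda}\big((n_j^{k,n})^{-1}\big). \]

Corollary 11 then ensures that

\[ \mathbb{P}\big(H_{n_j^{k,n}} > \eta \overline{\Lambda}(n^{-1})\big) \le C_{p,\Lambda,\eta} \left(\frac{n_j^{k,n}}{n}\right)^{\rho p}, \]

where $C_{p,\Lambda,\eta}$ depends only on $p$, $\Lambda$ and $\eta$, for all $j, k, n \ge 1$, with the convention $H_0 := 0$.

In the rest of the proof, we choose $k$ large enough, say $k \ge k_0$, so that $\lambda^k_{\max} \le (2(2C_{p,\Lambda,\eta})^{1/(\rho p)})^{-1}$. Then consider some integer $j_k$ such that $\sum_{j > j_k} \lambda_j^k \le \lambda^k_{\max}$. Since $n_j^{k,n}/n \to \lambda_j^k$ as $n \to \infty$ for all $j \ge 1$ and also $\sum_{j > j_k} n_j^{k,n}/n \to \sum_{j > j_k} \lambda_j^k$, there exists an integer $n_k$ such that for all $n \ge n_k$, $n_j^{k,n}/n \le 2\lambda_j^k$, $1 \le j \le j_k$, and $\sum_{j > j_k} n_j^{k,n}/n \le 2\lambda^k_{\max}$. In particular, $n_j^{k,n}/n \le 2\lambda^k_{\max}$ for all $j \ge 1$. Consequently, using the fact that $|\ln(1 - x)| \le 2x$ when $0 \le x \le 1/2$, we have for all $n \ge n_k$,

\[ \sum_{j \ge 1} \Big|\ln\Big(1 - \mathbb{P}\big(H_{n_j^{k,n}} > \eta \overline{\Lambda}(n^{-1})\big)\Big)\Big| \le 2 C_{p,\Lambda,\eta} \left(\sum_{j=1}^{j_k} \left(\frac{n_j^{k,n}}{n}\right)^{\rho p} + (2\lambda^k_{\max})^{\rho p}\right). \]

The parenthesis in the upper bound converges to $\sum_{j=1}^{j_k} (\lambda_j^k)^{\rho p} + (2\lambda^k_{\max})^{\rho p}$ as $n \to \infty$, which is smaller than $(\lambda^k_{\max})^{\rho p - 1}(1 + 2^{\rho p})$ (since $\sum_{j \ge 1} \lambda_j^k \le 1$). The result follows since $\lambda^k_{\max} \to 0$ as $k \to \infty$.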

4.3. Height functions. The aim of this section is to provide an analog of 
Theorem 2 for a family of functions coding the heights of leaves in ordered 
versions of the trees. In the special case of beta-splitting models, this con- 
vergence of leaf-height functions was suggested, but not proven, by Aldous 

The ordered version of $T_n$ is obtained by putting the set of children of every nonleaf vertex of $T_n$ in exchangeable random order, independently over distinct vertices and conditionally on $T_n$. This is usually achieved by taking (rooted) planar embeddings of the trees, where the order among children of a vertex is read from the clockwise ordering of edges going from the vertex to its children. We then define the order $\preceq_n$ as a linear order on the leaves $\{1\}, \ldots, \{n\}$ of $T_n$ by saying that $\{i\} \preceq_n \{j\}$ if the subtree pending from the most recent common ancestor $\{i\} \wedge \{j\}$ of $\{i\}$ and $\{j\}$ that contains $\{i\}$ comes before the subtree pending from $\{i\} \wedge \{j\}$ that contains $\{j\}$.

If $(T_n, n \ge 1)$ is a strongly consistent family of trees, we also want the orders $(\preceq_n, n \ge 1)$ to satisfy a consistency property, namely, the restriction of $\preceq_{n+1}$ to $\{1\}, \{2\}, \ldots, \{n\}$ is $\preceq_n$. With our interpretation of ordered trees as planar embeddings, this means that the embeddings are drawn consistently. This can be achieved inductively as follows, starting from $\preceq_1$, the trivial order on $\{\{1\}\}$. Suppose we are given $T_{n+1}$ and $\preceq_n$. Denote by $b(\{n+1\})$ the father of $\{n+1\}$ in $T_{n+1}$. For any nonleaf vertex $v$ of $T_{n+1}$ distinct from $b(\{n+1\})$, the children of $v$ are ordered in the same way for $T_n$, $\preceq_n$. Hence, the restriction to $\{1\}, \ldots, \{n\}$ of $\preceq_{n+1}$ must be $\preceq_n$.

Next, two possibilities occur: either $b(\{n+1\})$ was already a vertex of $T_n$ or $b(\{n+1\})$ is a newly added vertex in $T_{n+1}$ with two offspring.

• If $b(\{n+1\})$ is a vertex of $T_n$ with $r$ children ordered as $c_1, \ldots, c_r$, we let $\{n+1\}$ be the $j$th son of $b(\{n+1\})$ in $T_{n+1}$, $1 \le j \le r + 1$, with equal probability $1/(r+1)$, and the order of the other children is preserved.

• Otherwise, $b(\{n+1\})$ must have a unique son $c$ besides $\{n+1\}$ in $T_{n+1}$ and we let $\{n+1\}$ be placed before or after $c$ with equal probability $1/2$.

Note that $\preceq_n$ naturally extends to a linear order on $T_n$ by letting $v \preceq_n w$ if either $v$ is an ancestor of $w$ or $v \wedge w = \{i\} \wedge \{j\}$ for some leaves $\{i\}, \{j\}$ such that $\{i\} \preceq_n \{j\}$. This corresponds to the usual depth-first search order for rooted planar trees.
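
Both cases of the inductive rule amount to inserting the new leaf at a uniformly chosen slot among the existing ordered children of its father. A minimal sketch, assuming children are stored as a Python list of labels (the function name and data layout are illustrative, not from the paper):

```python
import random
from typing import List, Optional


def insert_child_uniformly(children: List[str], new_child: str,
                           rng: Optional[random.Random] = None) -> List[str]:
    """Insert new_child at one of the r + 1 slots of an ordered list of r children.

    Each slot has probability 1 / (r + 1), and the relative order of the
    existing children is preserved. With r = 1 this is the "before or after c,
    with probability 1/2 each" case of a newly created branch point.
    """
    rng = rng or random.Random()
    position = rng.randrange(len(children) + 1)  # uniform on {0, ..., r}
    return children[:position] + [new_child] + children[position:]
```

Because the old children keep their relative order, restricting the new order to the old leaves returns the old order, which is the consistency property required above.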
For each $n \ge 1$, we associate with the ordered tree $T_n^{\mathrm{ord}} = (T_n, \preceq_n)$ its leaf-height function $h_n$, defined on $[0,1]$ by $h_n(0) := 0$, $h_n(1) := 0$ and, for $1 \le i \le n$,

\[ h_n\!\left(\frac{i}{n+1}\right) := \text{height of the $i$th leaf (in the left-to-right ordering)}, \]

with linear interpolation. In general, the leaf-height function does not encode the full shape of the discrete tree and, more precisely, leaves some ambiguity where there are multiple branch points (e.g., the two possible unlabeled ordered trees with five leaves, all at distance 3 from the root vertex, are not distinguished by the leaf-height process).
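
The definition of the leaf-height function can be made concrete as follows. This is a minimal sketch, assuming the leaf heights are already listed in left-to-right order; the function name is illustrative.

```python
from typing import Sequence


def leaf_height_function(heights: Sequence[float], t: float) -> float:
    """Evaluate h_n(t) for t in [0, 1].

    heights[i - 1] is the height of the i-th leaf from the left, so that
    h_n(i / (n + 1)) = heights[i - 1], h_n(0) = h_n(1) = 0, and values in
    between are obtained by linear interpolation.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    n = len(heights)
    values = [0.0, *heights, 0.0]      # values at the grid points i / (n + 1)
    position = t * (n + 1)
    left = min(int(position), n)       # index of the grid point to the left of t
    fraction = position - left
    return values[left] * (1.0 - fraction) + values[left + 1] * fraction
```

For example, with `heights = [2, 3, 3]` the function takes the value 2 at $t = 1/4$, the value 3 at $t = 1/2$ and $t = 3/4$, and vanishes at both endpoints.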

Similarly, but fully encoding, a continuous height function $h : [0,1] \to \mathbb{R}_+$, $h(0) = h(1) = 0$, can be associated with the limiting fragmentation tree $\mathcal{T}$. Roughly, the construction of $h$ proceeds as follows (for details, we refer to Theorem 3 and Section 4.1 of [29], where it is more precisely proved that any fragmentation tree with an infinite dislocation measure, which is the case here, can be encoded into such a continuous function). For each $k, n$ such that $k \le n$, let $I_k^n \in \{1, \ldots, n\}$ be the position of the leaf $\{k\}$ among the leaves of $T_n$, with respect to the left-to-right ordering $\preceq_n$. Then define

\[ U_k := \lim_{n \to \infty} \frac{I_k^n}{n+1}. \]

These limits exist a.s. and the $U_k$, $k \ge 1$, are i.i.d. uniformly distributed on $[0,1]$. The height function $h$ is then defined on $\{U_k, k \ge 1\}$ by $h(U_k) :=$ height of $\{k\}$ in $\mathcal{T}$ and its definition can be extended continuously to $[0,1]$. The tree $\mathcal{T}$ can be recovered from $h$: it is isometric to the quotient space $([0,1], d)/\!\sim$, where $d(x,y) := h(x) + h(y) - 2\inf_{z \in [x,y]} h(z)$ and $x \sim y \Leftrightarrow d(x,y) = 0$. An order $\preceq$ on the leaves of $\mathcal{T}$ is then implicitly given by the natural order on $[0,1]$: let $x, y \in [0,1]$; if their images $\bar{x}, \bar{y}$ by projection on the quotient space
are leaves, then $\bar{x} \preceq \bar{y} \Leftrightarrow x \le y$. Further, according to Theorem 4 of [29], the function $h$ is a.s. Hölder-continuous of any order $\theta < \gamma$, but not of order $\theta > \gamma$ when $\nu$ integrates $s_1^{-1}$.

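The pseudo-metric $d(x,y) = h(x) + h(y) - 2\inf_{z\in[x,y]} h(z)$ that recovers the tree from its height function is easy to evaluate on a discretized $h$. The following is a minimal sketch, assuming $h$ is sampled on a regular grid and points are referred to by grid index; the function name is illustrative.

```python
from typing import Sequence


def tree_distance(h: Sequence[float], i: int, j: int) -> float:
    """d(x_i, x_j) = h(x_i) + h(x_j) - 2 * min of h over the grid points between them."""
    lo, hi = min(i, j), max(i, j)
    return h[i] + h[j] - 2.0 * min(h[lo:hi + 1])
```

Two grid points at distance zero, such as two visits of the same height with no lower value in between, are identified in the quotient; they represent the same point of the tree.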
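The a.s. convergence in Proposition 7 gives us a first connection between $h_n$ and $h$; namely, for all $k$,

\[ \frac{1}{n^{\gamma}\ell(n)} \left(h_n\!\left(\frac{I_1^n}{n+1}\right), \ldots, h_n\!\left(\frac{I_k^n}{n+1}\right)\right) \underset{n \to \infty}{\longrightarrow} \Gamma(1-\gamma)\,\big(h(U_1), \ldots, h(U_k)\big) \quad \text{a.s.} \tag{31} \]

More precisely, the following holds.

Theorem 15. In the situation of Theorem 2,

\[ \left(\frac{h_n(t)}{n^{\gamma}\ell(n)}\right)_{0 \le t \le 1} \overset{(p)}{\longrightarrow} \big(\Gamma(1-\gamma)\, h(t)\big)_{0 \le t \le 1} \]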
for the uniform norm on the space of continuous functions on $[0,1]$.

Proof. Let $\bar{h}_n := h_n / (n^{\gamma}\ell(n)\Gamma(1-\gamma))$ and note that the convergences (31) imply that the only possible uniform limit (in distribution) for subsequences of $\bar{h}_n$ is $h$. Similarly to the proof of Theorem 2, we can strengthen (31) into convergence in probability for the uniform norm, by using a certain uniform estimate. This is inspired by a tightness estimate used in the proof of Aldous [4], Theorem 20 for convergence of contour functions.

Fix $k \le n$ and consider the order statistics $I^n_{(i)}$, $1 \le i \le k$, of $I^n_i$, $1 \le i \le k$. Also let $I^n_{(0)} := 0$, $I^n_{(k+1)} := n + 1$. Then introduce

\[ w^k(\bar{h}_n) := \max_{0 \le i \le k}\ \sup_{t \in [I^n_{(i)}/(n+1),\, I^n_{(i+1)}/(n+1)]} \left|\bar{h}_n(t) - \bar{h}_n\!\left(\frac{I^n_{(i)}}{n+1}\right)\right|. \]

Our goal is to prove that

\[ \lim_{k \to \infty} \limsup_{n \to \infty} \mathbb{P}\big(w^k(\bar{h}_n) \ge \eta\big) = 0 \qquad \forall \eta > 0, \tag{32} \]

which is the analog of formula (30) of Aldous [4], Theorem 20, with $a = 0$ there. Following the last lines of the proof of Aldous, one sees that (32) implies the tightness of $(\bar{h}_n, n \ge 1)$.
To get (32), first note that 



wi(hn) < max 
0<«<fc 



max hn{t) — min hn{t) 

te[/(",f/(n+l),/(7^\j/(n+l)] te[/^",.V(n+l),7j".;\j/{n+l)] 



<max|d.({Ca^n,{Cin"})l, 

0<t<k 

where dn is the metric associated with the real trees Tn / {n"^ i{n)T {1 — 7)) 
and {q^^} is the leaf of T„ that has the highest height among the leaves 
{j} of Tn such that {/^"^ } :<n {j} <n {-^('J+i)}, where, by convention, both 
{I7q\ } and {/J^' , j^-i} denote the root p{T) of T. Similarly, among these leaves, 
{^min'} i^ ^^^ °^^ ^^^* ^^^ t^^ lowest height. Then define v^'ax ™ '^k'-~ 

7^(^„,{l},...,{fc})by 

and define similarly f^'j^'*. Now, fix £,77 > 0. Proposition 9 ensures that for 
k large enough and then for n sufficiently large, 

P( max d.dCa'i*}, <'^^) >ri)<e 



and 



~j / r n,k,i^ n,k,i\ ^ \ ^ 
0<i<k 
On the other hand, dn(?^max > VW*) <^n({-?'(") },{I{i+i)}) and, using Proposition 7,

max^^„({/("^''}, {/(":J:i)}) -^ max^ d{L\f^,L\^^-^-^) a.s. as n ^ oo, 

where L^n ^ • • • ^ L/^^n denotes the ^-ordered sequence of leaves {1}, . . . , {/c} 
in T and L^^^ := L^^^,-^-. := p{T). Finally, it is not hard to check that the com- 
pactness of T and its ordered leaf- density (informally, this means that the 
leaves are dense with respect to the order :<; see [29], Section 4.1, for precise 
details) imply that maxo<i<fc(i(L/^^,L^ , -^J -^ a.s. as A; ^ oo. Therefore, 
for k large enough and then for n sufficiently large, 

¥{wl(hn) > 3r?) < 3e, 

hence (32). 

With this available, we just write 

sup \hn{t) - h{t)\ 

0<t<l 



< Wf.{hn) + max sup 

°-^-^G[/("^f/(n+l),/,";^;j/{n+l)] 



h{t) - h 



jn,k 

n+1 



+ max 

0<i<fc 



/ TIT- \ /I 



rn,k 



n + l 



where $U_{(i)}$, $1 \le i \le k$, are the order statistics of $U_1, \ldots, U_k$. The desired convergence in probability is now a consequence of (31), (32), the fact that $I_k^n/(n+1)$ converges to $U_k$ a.s., and the a.s. continuity of $h$.

It was implicit in this proof that we were working with a strongly consistent family of discrete trees built from a self-similar fragmentation continuum tree and our usual argument shows that it still holds for any strongly consistent family. □

5. Beta-splitting, alpha and stable trees. 

5.1. Aldous's beta-splitting models. Aldous [1] suggests a further study of what he calls beta-splitting models, where

\[ q_n^{\text{Aldous-}\beta}(k) = \frac{1}{Z_n} \int_0^1 \binom{n}{k} x^{k+\beta} (1-x)^{n-k+\beta}\, dx = \frac{1}{Z_n} \binom{n}{k} \frac{\Gamma(\beta + k + 1)\Gamma(\beta + n - k + 1)}{\Gamma(n + 2\beta + 2)}, \]

$1 \le k \le n - 1$, for some $-2 < \beta < \infty$. He says that these are sampling consistent and that he would like to establish continuum random tree limits (known only for $\beta = -3/2$, the Brownian CRT of Aldous [2]) also for all $-2 < \beta < -1$. He studies the asymptotic behavior of a randomly chosen leaf and heuristically argues that leaf-height functions rescaled in the same way should also converge. Our Theorem 2 and its height function ramification in Theorem 15 turn Aldous's heuristics into rigorous mathematics.

It is clear from Aldous's work [1] that the beta-splitting model with $-2 < \beta < -1$ corresponds to a binary dislocation measure

\[ \nu_{\text{Aldous-}\beta}(s_1 \in dx) = C_\beta\, x^{\beta} (1 - x)^{\beta}\, 1_{\{1/2 \le x < 1\}}\, dx \]

and therefore satisfies the regular variation condition (6) with $\gamma = -\beta - 1$ and $\ell(x) \sim C_\beta/(-1 - \beta)$. Since the splitting rules do not depend on $C_\beta$, we will choose $C_\beta = (-\beta - 1)/\Gamma(2 + \beta)$ in the sequel.

Note that the symmetrized binary splitting rule above naturally gives rise to rooted ordered (or planar) trees $T_n^{\mathrm{ord},\circ}$ by the obvious recursive construction that builds tree $T_n^{\mathrm{ord},\circ}$ from a left subtree with $k$ leaves and a right subtree with $n - k$ leaves, with probability $q_n^{\text{Aldous-}\beta}(k)$, $1 \le k \le n - 1$. We can now enumerate leaves from left to right and record their heights

\[ h_n(i/(n+1)) = \text{distance from the root of the $i$th leaf from left to right.} \tag{33} \]

Also putting $h_n(0) = h_n(1) = 0$ and continuously extending to $[0,1]$ by linear interpolation gives the leaf-height function (which, in the binary case, fully encodes the discrete tree, just as the limiting height function fully encodes the limiting CRT) referred to by Aldous [1].
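
The recursive construction just described can be simulated directly. The sketch below is illustrative only: it normalizes the weights $\binom{n}{k}\Gamma(\beta+k+1)\Gamma(\beta+n-k+1)$ numerically (the factor $\Gamma(n+2\beta+2)$ and the normalizing constant do not depend on $k$ and cancel), and it returns the leaf heights in left-to-right order for a planted tree whose root edge has length 1. The function names are assumptions, and the plain recursion is only suitable for moderate $n$.

```python
import math
import random
from typing import List


def beta_splitting_probs(n: int, beta: float) -> List[float]:
    """Return [q_n(1), ..., q_n(n-1)] for the beta-splitting rule, beta > -2."""
    if n < 2 or beta <= -2:
        raise ValueError("need n >= 2 and beta > -2")
    log_weights = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + math.lgamma(beta + k + 1) + math.lgamma(beta + n - k + 1)
        for k in range(1, n)
    ]
    shift = max(log_weights)                       # avoid overflow in exp
    weights = [math.exp(v - shift) for v in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def sample_leaf_heights(n: int, beta: float, rng: random.Random,
                        height: int = 1) -> List[int]:
    """Leaf heights, left to right, of a random planted tree with n leaves.

    `height` is the height of the vertex at the top of the current root edge:
    a leaf if n == 1, otherwise the branch point where the tree splits.
    """
    if n == 1:
        return [height]
    k = rng.choices(range(1, n), weights=beta_splitting_probs(n, beta))[0]
    left = sample_leaf_heights(k, beta, rng, height + 1)
    right = sample_leaf_heights(n - k, beta, rng, height + 1)
    return left + right
```

For $\beta = 0$ the weights are constant in $k$, so the split is uniform on $\{1, \ldots, n-1\}$, which is the Yule model mentioned later in this section.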

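Corollary 16. For a strongly sampling consistent family of trees $T_n^{\circ}$, $n \ge 1$, from the beta-splitting model with $-2 < \beta < -1$, we have

\[ \frac{T_n^{\circ}}{n^{-\beta-1}} \overset{(p)}{\longrightarrow} \mathcal{T}_{(-\beta-1,\, \nu_{\text{Aldous-}\beta})} \]

for the Gromov-Hausdorff metric. Furthermore, the associated rescaled leaf-height functions converge to an associated limiting height function (see Section 4.3)

\[ \left(\frac{h_n(t)}{n^{-\beta-1}}\right)_{0 \le t \le 1} \overset{(p)}{\longrightarrow} \big(h_{\text{Aldous-}\beta}(t)\big)_{0 \le t \le 1} \]

for the uniform norm.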
5.2. Ford's alpha models. There are several versions of the alpha model of random binary combinatorial trees, ordered and unordered, labeled and unlabeled, and each can be described in different ways; see Ford [21, 22]. We focus here on the induced distributions $P_n^{\mathrm{ord},\circ}$ on $\mathbb{T}_n^{\mathrm{ord},\circ}$, unlabeled shapes of planted (actually binary) plane (i.e., ordered) trees with $n$ leaves. Ford's original sequential construction leads to an increasing sequence of random trees $T_n^{\mathrm{ord},\circ} \sim P_n^{\mathrm{ord},\circ}$, $n \ge 1$, and we shall use this notation throughout our alpha model discussion. Fix $\alpha \in [0,1]$.

The sequential construction starts with the unique planted binary unlabeled plane trees $T_1^{\mathrm{ord},\circ}$ and $T_2^{\mathrm{ord},\circ}$ with one and two leaves, respectively. Given the random tree $T_n^{\mathrm{ord},\circ}$ with $n$ leaves constructed following these rules, the $(n+1)$st leaf is added as follows: choose an edge according to weights $\alpha$ on edges between two inner vertices and $1 - \alpha$ on edges between a leaf and an inner vertex. Since there are $n - 1$ inner edges and $n$ leaf edges, the normalization constant is $n - \alpha$. Replace this edge between its two vertices by a new vertex and two edges linking its two vertices to the new vertex. Choose whether to attach the new leaf to the left or to the right of the new vertex with equal probability. The resulting random tree with $n + 1$ leaves is called $T_{n+1}^{\mathrm{ord},\circ}$.
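
The sequential construction translates into a short simulation. The sketch below is an illustration under stated assumptions, not Ford's own code: vertices are integers, the root is vertex 0, each non-root vertex names the edge to its parent, and the dictionaries `parent` and `children` are an ad hoc representation chosen here.

```python
import random
from typing import Dict, List, Tuple

ROOT = 0


def ford_alpha_tree(n: int, alpha: float,
                    rng: random.Random) -> Tuple[Dict[int, int], Dict[int, List[int]]]:
    """Grow a planted binary plane tree with n leaves by Ford's rule."""
    if n < 1 or not 0.0 <= alpha <= 1.0:
        raise ValueError("need n >= 1 and 0 <= alpha <= 1")
    if n == 1:
        return {1: ROOT}, {ROOT: [1], 1: []}
    # Two-leaf tree: root - branch point 1 - leaves 2 and 3.
    parent = {1: ROOT, 2: 1, 3: 1}
    children = {ROOT: [1], 1: [2, 3], 2: [], 3: []}
    next_id = 4
    for _ in range(n - 2):
        edges = list(parent)  # a non-root vertex v stands for the edge (parent[v], v)
        # Weight alpha for inner edges, 1 - alpha for leaf edges.
        weights = [alpha if children[v] else 1.0 - alpha for v in edges]
        c = rng.choices(edges, weights=weights)[0]
        p = parent[c]
        v, leaf = next_id, next_id + 1
        next_id += 2
        children[p][children[p].index(c)] = v        # v replaces c below p
        parent[v], parent[c], parent[leaf] = p, v, v
        children[leaf] = []
        # The new leaf goes to the left or to the right with equal probability.
        children[v] = [leaf, c] if rng.random() < 0.5 else [c, leaf]
    return parent, children


def leaf_heights(children: Dict[int, List[int]]) -> List[int]:
    """Heights of the leaves in left-to-right (depth-first) order."""
    heights, stack = [], [(ROOT, 0)]
    while stack:
        v, h = stack.pop()
        if not children[v]:
            heights.append(h)
        for c in reversed(children[v]):              # leftmost child is popped first
            stack.append((c, h + 1))
    return heights
```

With this representation, a tree with $n$ leaves has $2n - 1$ non-root vertices, of which $n - 1$ are branch points, matching the count of inner and leaf edges used for the normalization constant $n - \alpha$.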

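We can now deduce the following corollary from Theorems 2 and 15.

Corollary 17. Let $T_n^{\circ}$ be the unlabeled tree derived from Ford's sequential construction by forgetting the order of branches. Then

\[ \frac{T_n^{\circ}}{n^{\alpha}} \overset{(d)}{\longrightarrow} \mathcal{T}_{(\alpha,\, \nu_{\text{Ford-}\alpha})} \tag{34} \]

for the Gromov-Hausdorff topology, where

\[ \nu_{\text{Ford-}\alpha}(s_1 \in dx) = \frac{1}{\Gamma(1-\alpha)} \big(\alpha (x(1-x))^{-\alpha-1} + (2 - 4\alpha)(x(1-x))^{-\alpha}\big)\, 1_{\{1/2 \le x \le 1\}}\, dx. \]

Furthermore, the associated rescaled leaf-height functions (33) encoding $T_n^{\mathrm{ord},\circ}$ converge

\[ \left(\frac{h_n(t)}{n^{\alpha}}\right)_{0 \le t \le 1} \overset{(d)}{\longrightarrow} \big(h_{\text{Ford-}\alpha}(t)\big)_{0 \le t \le 1} \]

for the uniform topology on continuous functions defined on $[0,1]$.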

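Proof. Ford [21] shows that $(P_n^{\mathrm{ord},\circ})_{n \ge 1}$ are the distributions of a sampling consistent Markov branching model with splitting kernel

\[ q_n^{\text{Ford-}\alpha}(k) = \frac{\Gamma_\alpha(k)\Gamma_\alpha(n-k)}{\Gamma_\alpha(n)} \left(\frac{\alpha}{2}\binom{n}{k} + (1 - 2\alpha)\binom{n-2}{k-1}\right), \qquad 1 \le k \le n - 1, \]

where $\Gamma_\alpha(n) = (n - 1 - \alpha)(n - 2 - \alpha) \cdots (2 - \alpha)(1 - \alpha) = \Gamma(n - \alpha)/\Gamma(1 - \alpha)$. For $0 < \alpha < 1$, Ford [22] also indicates that as $n \to \infty$, for all $0 < x < 1$,

\[ n^{1+\alpha} q_n^{\text{Ford-}\alpha}([nx]) \sim \frac{1}{\Gamma(1-\alpha)} \left(\frac{\alpha}{2}(x(1-x))^{-\alpha-1} + (1 - 2\alpha)(x(1-x))^{-\alpha}\right) =: f_{\text{Ford-}\alpha}(x). \]

In the light of (16), we associate the binary dislocation measure

\[
\begin{aligned}
\nu_{\text{Ford-}\alpha}(s_1 \in dx) &= \big(f_{\text{Ford-}\alpha}(x) + f_{\text{Ford-}\alpha}(1 - x)\big)\, 1_{\{1/2 \le x < 1\}}\, dx \\
&= 2 f_{\text{Ford-}\alpha}(x)\, 1_{\{1/2 \le x < 1\}}\, dx.
\end{aligned}
\]

It is clear from Corollary 4 and the discussion which followed that the dislocation measure $\nu_{\text{Ford-}\alpha}$ induces Ford's splitting rule $(q_n)_{n \ge 2}$. By application of Theorem 2, (34) holds with $\overset{(d)}{\to}$ replaced by $\overset{(p)}{\to}$ for $\widetilde{T}_n^{\circ}$ instead of $T_n^{\circ}$, where $(\widetilde{T}_n^{\circ})_{n \ge 1}$ is a strongly sampling consistent family derived from the homogeneous fragmentation with dislocation measure $\nu_{\text{Ford-}\alpha}$. But, according to Ford [21], for each fixed $n \ge 1$, there is the identity in distribution $\widetilde{T}_n^{\circ} \overset{(d)}{=} T_n^{\circ}$. Theorem 15 can now be applied in the same way. □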
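
As a sanity check on the two ingredients of this proof, one can evaluate the alpha model splitting kernel, in the form $\frac{\Gamma_\alpha(k)\Gamma_\alpha(n-k)}{\Gamma_\alpha(n)}\big(\frac{\alpha}{2}\binom{n}{k} + (1-2\alpha)\binom{n-2}{k-1}\big)$ with $\Gamma_\alpha(n) = \Gamma(n-\alpha)/\Gamma(1-\alpha)$, and compare $n^{1+\alpha} q_n(k)$ with the limiting density at $x = k/n$. The sketch below assumes this form of the kernel and restricts to $0 \le \alpha < 1$; the function names are illustrative.

```python
import math


def ford_split_prob(n: int, k: int, alpha: float) -> float:
    """Alpha model probability that the left subtree has k of the n leaves."""
    if not (n >= 2 and 1 <= k <= n - 1 and 0.0 <= alpha < 1.0):
        raise ValueError("need n >= 2, 1 <= k <= n - 1 and 0 <= alpha < 1")
    # log of Gamma_alpha(k) Gamma_alpha(n - k) / Gamma_alpha(n)
    log_gamma_ratio = (math.lgamma(k - alpha) + math.lgamma(n - k - alpha)
                       - math.lgamma(n - alpha) - math.lgamma(1.0 - alpha))
    log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    # Uses C(n-2, k-1) = C(n, k) * k (n - k) / (n (n - 1)).
    bracket = alpha / 2.0 + (1.0 - 2.0 * alpha) * k * (n - k) / (n * (n - 1))
    return math.exp(log_gamma_ratio + log_binom) * bracket


def ford_limit_density(x: float, alpha: float) -> float:
    """f(x) = ((alpha/2) (x(1-x))^(-alpha-1) + (1-2 alpha) (x(1-x))^(-alpha)) / Gamma(1-alpha)."""
    u = x * (1.0 - x)
    return (alpha / 2.0 * u ** (-alpha - 1.0)
            + (1.0 - 2.0 * alpha) * u ** (-alpha)) / math.gamma(1.0 - alpha)
```

Summing `ford_split_prob(n, k, alpha)` over $k = 1, \ldots, n-1$ should return 1 up to rounding, and for large $n$ the ratio of `n ** (1 + alpha) * ford_split_prob(n, k, alpha)` to `ford_limit_density(k / n, alpha)` should be close to 1.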

As remarked by Ford [21], $q^{\text{Ford-}\alpha} = q^{\text{Aldous-}\beta}$ if and only if $\alpha = -\beta - 1 = 1/2$ (uniform model), $\alpha = \beta = 0$ (Yule model) or $\alpha = -\beta - 1 = 1$ (comb model). Also, we see that Ford's alpha model, as a model of exchangeable probability distributions on cladograms (by adding exchangeable leaf labels), is one of the wider class of Aldous's Markov branching models of type $c = 0$, $\nu(dx) = f(x)\, dx$ in Corollary 4.

Finally, we make some rather subtle points about Ford's sequential construction. It will be convenient to also consider $\widehat{T}_n^{\mathrm{ord}}$ as the tree $T_n^{\mathrm{ord},\circ}$ equipped with leaf labels in the order of Ford's sequential construction, and the unordered labeled tree $\widehat{T}_n$ derived from $\widehat{T}_n^{\mathrm{ord}}$. In the following list, we consider $\alpha \in (0,1)$ and also exclude $\alpha = 1/2$, where no such subtleties arise.

• If a uniform leaf of $T_n^{\mathrm{ord},\circ}$ is deleted, the tree generated by the remaining leaves has the same distribution as $T_{n-1}^{\mathrm{ord},\circ}$. Nevertheless, for $\widehat{T}_n^{\mathrm{ord}}$, with leaf labels in order of appearance, these labels are not exchangeable for $n \ge 3$. For example, in $\widehat{T}_3^{\mathrm{ord}}$, leaf 3 has height 2 if the edge of $\widehat{T}_2^{\mathrm{ord}}$ chosen for the insertion of 3 is adjacent to the root, which happens with probability $\alpha/(2 - \alpha) \ne 1/3$.

• For fixed $n \ge 5$, the joint distribution of the unlabeled trees $(T_m^{\mathrm{ord},\circ})_{1 \le m \le n}$ is not the same as the joint distribution of $(\widetilde{T}_m^{\mathrm{ord},\circ})_{1 \le m \le n}$, where $\widetilde{T}_n^{\mathrm{ord},\circ} = T_n^{\mathrm{ord},\circ}$, and $\widetilde{T}_{m-1}^{\mathrm{ord},\circ}$ is obtained from $\widetilde{T}_m^{\mathrm{ord},\circ}$ by deleting a uniform leaf, $m = n, \ldots, 2$. Therefore, $(T_n^{\circ})_{n \ge 1}$ is not strongly sampling consistent.

• We showed in Proposition 7 that for a strongly sampling consistent family of trees, convergence of finite-dimensional marginals holds almost surely with limiting trees $\mathcal{R}_k$ with edge lengths. In the next subsection, we will establish a corresponding result for $(\widehat{T}_n)_{n \ge 1}$. We also give a line-breaking construction of the almost sure limiting trees $\widehat{\mathcal{R}}_k$, $k \ge 1$.

• We conjecture that the completion of $\bigcup_k \widehat{\mathcal{R}}_k$ has the same distribution as $\mathcal{T}_{(\alpha,\, \nu_{\text{Ford-}\alpha})}$ and that $\widehat{\mathcal{R}}_k$ can be embedded in $\mathcal{T}_{(\alpha,\, \nu_{\text{Ford-}\alpha})}$ by suitable nonuniform, and presumably dependent, sampling of leaves.

5.3. Limiting edge lengths in Ford's sequential construction. Let $0 < \alpha < 1$. The limiting continuum random tree $\mathcal{T}_{(\alpha,\, \nu_{\text{Ford-}\alpha})}$ naturally contains its uniformly sampled subtrees $\mathcal{R}_k$, $k \ge 1$, and ordered versions $\mathcal{R}_k^{\mathrm{ord}}$ are coded in $h_{\text{Ford-}\alpha}$. If we denote the tree shape of $\mathcal{R}_k^{\mathrm{ord}}$ by $T_k^{\mathrm{ord}}$, then the distribution $P_k^{\mathrm{ord}}$ of $T_k^{\mathrm{ord}}$ is $P_k^{\mathrm{ord},\circ}$, equipped with exchangeable leaf labels. $\mathcal{R}_k$ is the almost sure scaling limit of the reduced trees $R(T_n^{\mathrm{ord}}, [k])$ as $n \to \infty$; see Proposition 7.

On the other hand, we naturally define $\widehat{P}_2^{\mathrm{ord}}$ to be uniform on the set $\mathbb{T}_2^{\mathrm{ord}}$ of two elements, and then $\widehat{P}_{n+1}^{\mathrm{ord}}$ directly from the sequential construction as the distribution of $\widehat{T}_{n+1}^{\mathrm{ord}}$, which is $\widehat{T}_n^{\mathrm{ord}}$ with the new leaf added according to Ford's rule and labeled $n + 1$, that is, we label leaves in their order of appearance. In this setting, we also establish a.s. convergence of reduced subtrees.

Proposition 18. (a) For all $k \ge 1$, we have

\[ \frac{R(\widehat{T}_n^{\mathrm{ord}}, [k])}{n^{\alpha}} \underset{n \to \infty}{\longrightarrow} \widehat{\mathcal{R}}_k^{\mathrm{ord}} \quad \text{a.s.} \]

in the sense of Gromov-Hausdorff convergence, where $(\widehat{\mathcal{R}}_k^{\mathrm{ord}})_{k \ge 1}$ is an increasing family of leaf-labeled $\mathbb{R}$-trees with edge lengths.

(b) The distribution of $\widehat{\mathcal{R}}_k^{\mathrm{ord}}$ is determined by the distributions of three independent random variables: (i) its shape $\widehat{T}_k^{\mathrm{ord}} \sim \widehat{P}_k^{\mathrm{ord}}$, (ii) its total length $S_k$ with density

\[ \frac{\Gamma(k + 1 - \alpha)}{\Gamma(k/\alpha)}\, s^{k/\alpha - 1} g_\alpha(s), \]

where $g_\alpha(s) = \frac{1}{\alpha} s^{-1-1/\alpha} f_\alpha(s^{-1/\alpha})$ is the Mittag-Leffler density derived from the stable density $f_\alpha$ with Laplace transform $e^{-\lambda^{\alpha}}$; (iii) Dirichlet edge length proportions $D_k = (D_k^{(1)}, \ldots, D_k^{(2k-1)}) \sim \mathcal{D}(1, \ldots, 1, (1-\alpha)/\alpha, \ldots, (1-\alpha)/\alpha)$, where, in $D_k$, we first list the $k - 1$ inner edges, then the $k$ leaf edges, each by depth-first search.

(c) $(\widehat{\mathcal{R}}_k^{\mathrm{ord}})_{k \ge 1}$ is an inhomogeneous Markov process in its natural filtration $(\mathcal{F}_k)_{k \ge 1}$. More precisely, given $(\widehat{T}_k^{\mathrm{ord}}, S_k, D_k)$, the conditional distribution of $\widehat{\mathcal{R}}_{k+1}^{\mathrm{ord}}$ is that where the Ford insertion happens at an edge $E_k$, sampled from the distribution on edges induced by $D_k$; $S_{k+1}$ has conditional density

given $E_k$ is an inner edge, let $C_{k+1} \sim \mathrm{Unif}(0,1)$, otherwise, $C_{k+1} \sim \beta(1, (1-\alpha)/\alpha)$, independently from $S_{k+1}$; split $E_k$ into its proportions $C_{k+1}$ and $1 - C_{k+1}$, $C_{k+1}$ being closer to the root. This determines the proportions $D_{k+1}$.

Proof. Fix $k \ge 1$ and $\widehat{T}_k^{\mathrm{ord}}$. For $n \ge k$, the reduced trees $R(\widehat{T}_n^{\mathrm{ord}}, [k])$ all have the same shape as $\widehat{T}_k^{\mathrm{ord}}$. In the transition from $n$ to $n + 1$, there may be no change of the reduced tree or one of the edge lengths may increase by 1. We can associate edges with $2k - 1$ colors, where each edge in $\widehat{T}_k^{\mathrm{ord}}$ represents a color (but not white, which is reserved for later). Edges have weights which increase. Initially ($n = k$), the weights are one for each inner edge and $(1 - \alpha)/\alpha$ for each leaf edge, zero for white. Each round, we pick a color at random, according to the current weights, and apply an updating rule as follows. Whenever an edge of the reduced tree is chosen (we recognize Ford's rule), the weight of that edge is increased by 1 and also the weight of white is increased by $(1 - \alpha)/\alpha$. Whenever we pick white, the weight of white is increased by $1/\alpha$.
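
The updating rule can be simulated as a weighted urn. The following is a minimal sketch under the stated initial conditions; the function name and the return format are illustrative assumptions.

```python
import random
from typing import List, Tuple


def run_colour_urn(k: int, alpha: float, steps: int,
                   rng: random.Random) -> Tuple[List[float], float]:
    """Run the colour-weight urn for `steps` rounds; return (edge weights, white weight).

    Edge weights are listed with the k - 1 inner edges first, then the k leaf edges.
    """
    if k < 1 or not 0.0 < alpha < 1.0:
        raise ValueError("need k >= 1 and 0 < alpha < 1")
    edges = [1.0] * (k - 1) + [(1.0 - alpha) / alpha] * k
    white = 0.0
    for _ in range(steps):
        pick = rng.choices(range(len(edges) + 1), weights=edges + [white])[0]
        if pick < len(edges):
            edges[pick] += 1.0                    # the chosen edge grows by one unit
            white += (1.0 - alpha) / alpha        # and a new subtree is started
        else:
            white += 1.0 / alpha                  # an existing subtree grows
    return edges, white
```

In either case the total weight increases by $1/\alpha$ per round, starting from $(k - \alpha)/\alpha$, which is the normalization $n - \alpha$ of Ford's rule divided by $\alpha$.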

This model contains the essence of a Chinese restaurant process (see, e.g., [36], Lecture 3). Specifically, if we further discriminate the white weight by colored numbers identifying the subtree on the reduced tree in which the new leaf is added, then these subtrees can be considered tables in a restaurant and their leaves are customers. Suppose, at stage $n$, $m$ subtrees are present on $R(\widehat{T}_n^{\mathrm{ord}}, [k])$. Each new customer joins any occupied table $i = 1, \ldots, m$ with probability $(n_i - \alpha)/(n - \alpha)$, where $n_i \ge 1$ is the number of customers already sitting at that table, and chooses a new table with remaining probability $(k + (m - 1)\alpha)/(n - \alpha)$. This describes an $(\alpha, k - \alpha)$ seating plan in the terminology of [36].
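
A two-parameter seating plan of this kind is straightforward to simulate. The sketch below is illustrative: with `theta = k - alpha` and $n - k$ customers already seated at $m$ tables, the unnormalized weights $n_i - \alpha$ and $\theta + m\alpha$ sum to $n - \alpha$, so they reproduce the probabilities stated above. The function name is an assumption.

```python
import random
from typing import List


def seating_plan(num_customers: int, alpha: float, theta: float,
                 rng: random.Random) -> List[int]:
    """Simulate an (alpha, theta) Chinese restaurant; return the table sizes."""
    if not 0.0 <= alpha < 1.0 or theta <= -alpha:
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    tables: List[int] = []
    for _ in range(num_customers):
        m = len(tables)
        # Occupied table i has weight n_i - alpha; a new table has weight theta + m * alpha.
        weights = [size - alpha for size in tables] + [theta + m * alpha]
        choice = rng.choices(range(m + 1), weights=weights)[0]
        if choice == m:
            tables.append(1)
        else:
            tables[choice] += 1
    return tables
```

For the reduced tree with $k$ leaves one would call, for example, `seating_plan(n - k, alpha, k - alpha, rng)`; the number of tables is then the number of subtrees attached to the reduced tree.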

(a)-(b) By [36], Theorem 3.8, the total number of tables scaled by $(n - k)^{\alpha}$ (where $n - k$ is the number of customers at stage $n$) converges almost surely so that for the total length $S_k^{(n)}$ of $R(\widehat{T}_n^{\mathrm{ord}}, [k])$,

\[ \frac{S_k^{(n)}}{n^{\alpha}} = \frac{S_k^{(n)} - 2k + 1}{(n - k)^{\alpha}} \cdot \frac{S_k^{(n)}}{S_k^{(n)} - 2k + 1} \cdot \frac{(n - k)^{\alpha}}{n^{\alpha}} \underset{n \to \infty}{\longrightarrow} S_k \quad \text{a.s.} \]

and the distribution of $S_k$ is as specified.

In particular, if ignoring white, the total color weight still tends to infinity, even though it is asymptotically negligible against white weight. If we only record changes to the color weights, the restricted model still has the dynamics of the updating rule and so the pre-limiting proportions $D_k(n)$ converge a.s. to the Dirichlet limit, as specified. Furthermore, $S_k^{(n)} D_k(n)/n^{\alpha} \to S_k D_k$ a.s. and, since the shape of reduced trees does not change (not even in the limit, as $D_k$ has only positive entries a.s.), this implies convergence in the Gromov-Hausdorff sense.

The independence of $\widehat{T}_k^{\mathrm{ord}}$, $S_k$ and $D_k$ can be seen by a conditioning argument: the independence of $(S_k, D_k)$ from $\widehat{T}_k^{\mathrm{ord}}$ follows since our argument actually gives us the conditional distribution of $(S_k, D_k)$ given $\widehat{T}_k^{\mathrm{ord}}$, which does not depend on $\widehat{T}_k^{\mathrm{ord}}$. Similarly, $(S_k^{(n)})_{n \ge k}$ gives us the times at which the color weights change that lead to $D_k$; if we condition on $(S_k^{(n)})_{k \le n \le N}$, then we still observe the same dynamics of color weights and letting $N \to \infty$, we get independence of $D_k$ from the $\sigma$-field $\mathcal{S}_k$ generated by $(S_k^{(n)})_{n \ge k}$ with respect to which $S_k$ is measurable.

(c) Consider weight processes $W_m$ leading to $\widehat{\mathcal{R}}_m^{\mathrm{ord}}$ as $1 \le m \le k + 1$ varies. First, note that for $1 \le m \le k$, $(\widehat{T}_m^{\mathrm{ord}}, S_m, D_m)$ is a measurable function of $(\widehat{T}_k^{\mathrm{ord}}, S_k, D_k)$. Therefore, the Markov property is trivially satisfied.

Now, let $t_{k+1} \in \mathbb{T}_{k+1}^{\mathrm{ord}}$ be such that $k + 1$ was added to an inner edge of the subtree $t_k$ of $t_{k+1}$, without loss of generality, directly to the left of the trunk. We then wish to calculate the expectation
= F{fk+i =tfc+i) / ••• / /(ei,...,e2fc,l-ei e2k,r) 

T{k + l + k{l-a)/a) 
^ (r((l-a)/a))^ 

x{ek+i---e2k{l-ei eafc)^^-^")/" 

^gair)dedr, 



r(fc + 2-a) (fc+i)/«^i. 
T{{k + l)/a) 



where e = (ei, . . . , 62^, 1 — ei — • • • — 62^) so as to identify the conditional 
distribution of (r^:^'\, C^+i, 5fc+i) given (f^'^'^, Dk,Sk) = (tfc, d, s), where d = 
(di, . . . , d2k-2, 1 — di — ■ ■ ■ — d2k-2)- We change variables 



_ dkS _ d2k-2S 
^k+2 — ) • • • ) e2A; — ) 

r r 



dics 

ei = 
fii(l - c)s
r


r 


r — s 

Cfc+l = 
and calculal
by a development of the first row. This gives

IE(/(^A;+l>'S'fc+l)l{r°^d=tfe+i}) 



^F(rr'i = tfc) 



/ 



dics di{l — c)s d2S 



ly ly ly 

d2k-2S {l-di- 



dk-is r — s dkS 

1 5 5 



d2k-2)s 



X K{du ■ ■ ■ d2k-2{l -di d2fc-2))^'-'"^/"5'/"-'ffa(5) 

for a positive constant $K$. We conclude that $\widehat{T}_{k+1}^{\mathrm{ord}}$, $C_{k+1}$ and $S_{k+1}$ are conditionally independent and that



nn 



rord 
fc+1 



^ord 



Cfc+ll-tfc 



'■ik,Dk = d,Sk = s) = ^di, 



/, 



Cfe+il^™ — tfc,-Dfc— d,Sfc- 



/; 



Sfc+i|f-d=tfc,Dfc=d,Sfc=s 



.(c) = 1, 



ga(^) 

ga{s)' 



Similarly, if $k + 1$ was added to a leaf edge of the subtree $t_k$ of $t_{k+1}$, without loss of generality, to the left of the first leaf edge in the order of depth-first search, which we may furthermore assume to be adjacent to the trunk in $t_k$, then there will be an additional $(1 - c)^{(1-\alpha)/\alpha - 1}$ in the change of variables since, now, $e_2$ and $e_{k+1}$ take the roles of $e_1$ and $e_2$, where $C_{k+1}$ is now the proportion of a leaf edge. We then get

^{T^+i = tfe_|_i|r™ = tfc, Dk = d,Sk = s) = ^dk, 






Proposition 18 is a generalization and refinement of [36], Exercises 7.4.10- 
7.4.13 dealing with the tree growth process in a Brownian excursion, a = 1/2. 

Corollary 19. The counting process $N_t = \sup\{k \ge 0 : S_k \le t\}$, $t \ge 0$,
is a time-inhomogeneous renewal process with hazard function

that is, the hazard rate is $h_t(y)$ at time $t$ if the last renewal occurred at
$t - y \ge 0$.

Note that since $f_{S_{m+1}|S_m = z}(y)$ integrates to 1, we have, for all $z > 0$,

,(i-)/«-i(, + y)^^(, ^y^ay= r((l -«)/«) ^^(,). 
a 

In the case $\alpha = 1/2$, we have $(1 - \alpha)/\alpha - 1 = 0$ so that the tilting
coefficients disappear and we can apply this formula to get $h_t(y) = t$. This
is the Poisson line-breaking construction of the Brownian continuum random
tree $\mathcal{T}_{{\rm Aldous}\text{-}(-3/2)} = \mathcal{T}_{{\rm Ford}\text{-}1/2}$ (see Aldous [2]), where the trees
$\mathcal{R}_k \sim T_k$ are constructed sequentially by breaking a line at the times of
a Poisson point process in the wedge $\{(x,t) : t \ge 0, 0 \le x \le t\}$ with unit
intensity per unit square. The heights of points generate the branch points
on the previously grown tree.
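The Brownian case lends itself to a short simulation. The following is a minimal sketch, not taken from the paper: it assumes only the description above (a unit-intensity Poisson point process in the wedge) and uses the fact that the wedge up to time $t$ has area $t^2/2$, so the cut times can be obtained by a time change of a rate-1 Poisson process. The function name `poisson_line_breaking` and its parameters are hypothetical.

```python
import math
import random


def poisson_line_breaking(k, seed=None):
    """Return [(S_1, J_1), ..., (S_k, J_k)] for the Brownian line-breaking sketch.

    S_i is the i-th cut time (time coordinate of the i-th wedge point) and
    J_i is the height of that point, i.e. the position on the tree grown so
    far (total length S_i) at which the next line segment is attached.
    """
    rng = random.Random(seed)
    cumulative = 0.0
    points = []
    for _ in range(k):
        # Arrival times of a rate-1 Poisson process on the half-line ...
        cumulative += rng.expovariate(1.0)
        # ... pulled back through t -> t^2 / 2, the area of the wedge up to t.
        cut_time = math.sqrt(2.0 * cumulative)
        # Given its time coordinate t, a wedge point has height uniform on [0, t].
        height = rng.uniform(0.0, cut_time)
        points.append((cut_time, height))
    return points


for cut_time, height in poisson_line_breaking(5, seed=1):
    print(f"cut at {cut_time:.3f}, branch point at height {height:.3f}")
```

The tree with $k$ leaves uses the first $k$ cut times and the first $k - 1$ heights; the sketch records only these numbers and does not build the metric tree itself.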

Proposition 18(iii) can be interpreted as the line-breaking construction of 
the alpha model random tree. The inhomogeneous renewal process replaces 
the inhomogeneous Poisson arrival process at linearly increasing rate t. The 
branch points (heights of points in the point process) are no longer chosen 
uniformly as in the Brownian case, but with intensity skewed within each 
leaf edge, by the $\beta(1, (1 - \alpha)/\alpha)$ choice replacing the uniform.

Denote by $V_1^{(n)}$ the number of leaves (out of $n$) added in (or as) subtrees
to the left of the spine connecting leaf 1 to the root.

Proposition 20. We have a.s.

$$\frac{V_1^{(n)}}{n} \to V_1 = \sum_{k=0}^{\infty} A_k W_k \left( \prod_{i=0}^{k-1} (1 - W_i) \right),$$

where $W_i \sim \beta(1 - \alpha, i\alpha + 1 - \alpha)$ are independent, $i \ge 0$, and,
independently, $A_k$, $k \ge 0$, are independent symmetric Bernoulli random
variables.

Proof. This is a consequence of the observation made in the proof of
Proposition 18 that the partition of leaves according to subtrees is a Chinese
restaurant. It is well known (see, e.g., [36]) that the table proportions are
given by the products $W_m (1 - W_{m-1}) \cdots (1 - W_0)$. At the time of its
creation, each subtree has an equal chance to grow on the left-hand and
right-hand side of the spine, hence the result. $\Box$
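The representation used in this proof can be sampled directly. The sketch below is an illustration under stated assumptions rather than part of the paper: it draws the table proportions $W_k (1 - W_{k-1}) \cdots (1 - W_0)$ by stick-breaking with $W_i \sim \beta(1 - \alpha, i\alpha + 1 - \alpha)$, keeps each table with probability 1/2 (the symmetric Bernoulli variable $A_k$), and truncates the infinite sum after `terms` tables. The truncation level and the function name are hypothetical choices, and the result is therefore only an approximation of the limiting proportion.

```python
import random


def sample_left_proportion(alpha, terms=1000, seed=None):
    """Approximate sample of sum_k A_k W_k prod_{i<k} (1 - W_i), 0 < alpha < 1."""
    rng = random.Random(seed)
    remaining = 1.0  # (1 - W_0) ... (1 - W_{k-1}): mass not yet assigned to a table
    left = 0.0
    for k in range(terms):
        w = rng.betavariate(1.0 - alpha, k * alpha + 1.0 - alpha)
        table = w * remaining          # proportion of the k-th table (subtree)
        if rng.random() < 0.5:         # A_k = 1: the subtree grows on the left
            left += table
        remaining *= 1.0 - w
    return left


print(sample_left_proportion(0.5, seed=1))
print(sample_left_proportion(0.3, seed=1))
```

Because the unassigned mass decays only polynomially in the number of tables, a larger `terms` gives a better approximation at the cost of run time.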

The distribution of $V_1$ is not new. It arises naturally in the more general
context of size-biased sampling of Poisson point processes.
Specifically, [35] identifies these atoms as the normalized jumps of a stable 

subordinator a with Laplace exponent A", tilted by ai " , that is, we can 
also express the distribution of Vi as 

T{l/a) 

See also [6, 36]. Recently, James, Lijoi and Prünster [30] specified the
density of $V_1$.

In general, $V_1$ does not have a uniform distribution as for $\alpha = 1/2$. Also,
$V_1^{(k)}$ is not independent of the shape of the tree with $k$ leaves. For
example, for $k = 3$, with two different shapes,

F{V{^^ = Olt^"^ =\) = -^— and F{vl^^ = 0\f^''^ =y^) = '^f-^ 

and these coincide if and only if $\alpha = 1/2$.

5.4. Stable trees. Duquesne and Le Gall [17] introduced a CRT that they
called the stable tree $\mathcal{T}_{{\rm stable}\text{-}\alpha}$ of index $\alpha \in (1,2]$, which describes the
genealogy of a (continuous-state) stable branching process with a single
infinitesimal ancestor conditioned to have unit total family size (integral of
population sizes over time). For $\alpha = 2$, this is Aldous's Brownian continuum
random tree, associated with Feller's diffusion. They have given the explicit
distribution of the tree $R(\mathcal{T}_{{\rm stable}\text{-}\alpha}, L_1, \ldots, L_n)$ spanned by $n$ uniformly
sampled leaves as follows. In fact, this identification of the
finite-dimensional marginal distributions of $\mathcal{T}_{{\rm stable}\text{-}\alpha}$ may be taken as an
alternative definition of the stable tree.

Proposition 21 (Theorem 3.3.3 of [17]). (i) Denote the shape of
$R(\mathcal{T}_{{\rm stable}\text{-}\alpha}, L_1, \ldots, L_n)$ by $T_n^{\rm ord}$. Then

p/T^ord_, ^ _ Qr(l - 1/a) yr (a-l)r(r^-a) 
^" ''^~nn-l/a)J±^_^ rjr(2-a) 
for any $t_n \in \mathbb{T}_n^{\circ}$, where $r_v$ is the number of children of vertex $v \in t_n$.

(ii) Given $T_n^{\rm ord} = t_n$, the total length $S_n$ and the edge length proportions
$D_n$ are conditionally independent; $D_n$ has a $\beta(1, \ldots, 1)$ distribution on
vectors of length $l = |t_n| - 1$, the number of edges of $t_n$; $S_n$ has density

/c irord-t (s) = -^.V .J,,. ictsY M n"*" '^rjias,! - u)du, 
Jb^\i„ -t„v J T{6tjT{l) Jo 

where $\delta_{t_n} = n - 1/\alpha + (1 - 1/\alpha) l$ and $\eta(t, v)$ is the density of a
$(1 - 1/\alpha)$-stable subordinator $(\sigma_t, t \ge 0)$ with Laplace transform
$E(e^{-\lambda \sigma_1}) = \exp\{-\lambda^{1 - 1/\alpha}\}$.

There are a number of direct consequences. 

Corollary 22 (Theorem 3.2.1 of [17], Lemma 5 of [33]). (i) The tree
shape without leaf labels, $T_n^{{\rm ord},\circ}$, is a Galton-Watson tree conditioned to
have $n$ leaves, whose offspring distribution has probability generating
function $z + \alpha^{-1} (1 - z)^{\alpha}$.

(ii) The unordered tree shapes $T_n^{\circ}$, $n \ge 1$, form a strongly sampling
consistent family of Markov branching models with splitting rule

$$q_n(k_1, \ldots, k_r) = \frac{C_{k_1, \ldots, k_r} \, \Gamma(2 - 1/\alpha) \, \alpha^{-(r-2)} \, \Gamma(r - \alpha)}{\Gamma(n - 1/\alpha) \, \Gamma(2 - \alpha)} \prod_{j=1}^{r} \frac{\Gamma(k_j - 1/\alpha)}{\Gamma(1 - 1/\alpha)}$$

for any $r \ge 2$, $k_1 \ge \cdots \ge k_r \ge 1$, where $C_{k_1, \ldots, k_r}$ is the combinatorial
constant given in (3).
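As a numerical sanity check of the splitting rule in Corollary 22(ii), the sketch below evaluates it on all partitions of a small $n$ into at least two parts and sums the values. It is an illustration only. The constant from (3) is not restated in this section, so the code assumes it to be $n!/(k_1! \cdots k_r! \, m_1! m_2! \cdots)$ with $m_j$ the number of parts equal to $j$; the function names are hypothetical, and $\alpha$ is restricted to $1 < \alpha < 2$ to stay away from the pole of $\Gamma(2 - \alpha)$ at $\alpha = 2$.

```python
import math
from collections import Counter


def partitions(n, largest=None):
    """Yield the partitions k_1 >= ... >= k_r >= 1 of n as tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def combinatorial_constant(ks):
    """ASSUMPTION standing in for (3): n! / (prod k_i! * prod m_j!)."""
    denom = 1
    for k in ks:
        denom *= math.factorial(k)
    for m in Counter(ks).values():
        denom *= math.factorial(m)
    return math.factorial(sum(ks)) // denom


def stable_splitting_rule(ks, alpha):
    """Evaluate the splitting rule at the partition ks, for 1 < alpha < 2."""
    n, r = sum(ks), len(ks)
    a = 1.0 / alpha
    value = combinatorial_constant(ks) * math.gamma(2 - a)
    value *= alpha ** (-(r - 2)) * math.gamma(r - alpha)
    value /= math.gamma(n - a) * math.gamma(2 - alpha)
    for k in ks:
        value *= math.gamma(k - a) / math.gamma(1 - a)
    return value


alpha = 1.5
for n in range(2, 7):
    total = sum(stable_splitting_rule(ks, alpha)
                for ks in partitions(n) if len(ks) >= 2)
    print(n, total)
```

Under the assumed form of the constant, the printed totals should agree with 1 up to floating-point error, which is what one expects of a probability distribution on the partitions of $n$ with at least two parts.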

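Miermont [33, 34] studies fragmentation processes associated with
$\mathcal{T}_{{\rm stable}\text{-}\alpha}$ and identifies the associated dislocation measure.

Proposition 23 ([33]). Let $(\sigma_x, x \ge 0)$ be a stable subordinator with
Laplace exponent $\lambda^{1/\alpha}$. Denote by $\Delta\sigma_{[0,1]} = (\Delta\sigma_x, x \in [0,1])^{\downarrow}$ the jump
sizes $\Delta\sigma_x = \sigma_x - \sigma_{x-}$ in decreasing order. Then $\mathcal{T}_{{\rm stable}\text{-}\alpha}$ is a
$(1 - 1/\alpha)$-self-similar fragmentation CRT with dislocation measure

$$\nu_{{\rm stable}\text{-}\alpha}(ds) = \frac{\alpha^2 \, \Gamma(2 - 1/\alpha)}{\Gamma(2 - \alpha)} \, E\left[ \sigma_1 ; \frac{\Delta\sigma_{[0,1]}}{\sigma_1} \in ds \right].$$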

The associated Lévy measure (18) of the tagged particle subordinator is

Astable-a(dx) = :f^r^(l " e--)l/°-2e"(l-l/«)- dx. 

r(i/a) 

By virtue of (20), which is equivalent to (6), the dislocation measure
satisfies the regular variation condition with $\ell(x) \sim \alpha/\Gamma(1/\alpha)$ and also
satisfies (7) for any $\rho > 0$ because the density of $\Lambda_{{\rm stable}\text{-}\alpha}$ decays
exponentially as $x \to \infty$ (see also the discussion in Section 4.2).
Therefore, we can apply Theorems 2 and 15.
Corollary 24. For a strongly sampling consistent family of trees $T_n^{\circ}$,
$n \ge 1$, from the Markov branching model with splitting rules identified in
Corollary 22(ii) for some $1 < \alpha < 2$, we have

^ ^ Ir^ ^ -'stable— a 

for the Gromov-Hausdorff metric. Furthermore, the associated rescaled
leaf-height functions (33) converge to the associated limiting height function
(see Section 4.3)

Kit) \ (p) ,, .... 

""n^^y^ ) o<t<i n^^^^i- V",-.taMo-. {t)h<t<i 

for the uniform norm. 

It is known ([17]) that $\mathcal{T}_{{\rm stable}\text{-}2} \sim 2\mathcal{T}_{{\rm Aldous}\text{-}(-3/2)}$. Here, doubling a
fragmentation CRT (i.e., all distances, or the associated height function)
corresponds to halving the fragmentation rates. Also, for $\alpha \in (1,2)$, the
factor $\alpha$ can be built into the limiting CRT as $\mathcal{T}_{{\rm stable}\text{-}\alpha}/\alpha$, which is the
CRT associated with the dislocation measure $\alpha \nu_{{\rm stable}\text{-}\alpha}$.

Several papers study the convergence of conditioned discrete
Galton-Watson trees. There are several different schemes of conditioning.
The closest to our setting is conditioning on the total number of vertices in
a Galton-Watson tree with offspring distribution in the domain of attraction
of a stable law. Geiger and Kauffmann [23] study the convergence of reduced
subtrees and show that the unconditional total length of $\mathcal{R}_k$ has a
Gamma($k + 1/\alpha$, 1) distribution. Duquesne [16] establishes the convergence
of associated height functions to the stable tree. It is not surprising that
conditioning on the total number of leaves or the total number of vertices
leads to the same limit, when suitably rescaled, since there are at most
twice as many vertices as leaves and the ratio converges to $\alpha$ a.s.

Marchal [32] has a sequential construction of the shapes of the reduced
stable tree similar to Ford's sequential construction of the alpha model.
Marchal associates weight $1 - 1/\alpha$ with each edge, but also puts weight
$k/\alpha - 1$ onto any vertex with $k$ subtrees. These weights also sum to
$n - 1/\alpha$ at growth stage $n$. At each growth stage, an edge or vertex is
chosen according to these weights and a new leaf edge is added, either with an
additional vertex in the "middle" of the chosen edge or attached directly to
the chosen vertex, increasing its number of subtrees by 1.
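The growth rule just described is easy to simulate. The following minimal sketch is an illustration, not Marchal's own code: it assumes the weights stated above ($1 - 1/\alpha$ per edge, $k/\alpha - 1$ per branch point with $k$ subtrees), starts from a single root edge, and stores the shape as hypothetical `parent` and `children` dictionaries. The helper `total_weight` recomputes the sum of weights so that the identity "weights sum to $n - 1/\alpha$" can be checked numerically.

```python
import random


def marchal_growth(n, alpha, seed=None):
    """Grow a tree shape with n leaves; returns (parent, children) dictionaries.

    Vertex 0 is the root and vertex 1 the first leaf. An edge is identified
    with its endpoint farther from the root.
    """
    rng = random.Random(seed)
    parent = {1: 0}
    children = {0: [1], 1: []}
    next_id = 2
    for _ in range(n - 1):
        edges = list(parent)
        branch_points = [v for v in children if v != 0 and children[v]]
        weights = [1.0 - 1.0 / alpha] * len(edges)
        weights += [len(children[v]) / alpha - 1.0 for v in branch_points]
        pick = rng.choices(range(len(weights)), weights=weights)[0]
        leaf = next_id
        next_id += 1
        children[leaf] = []
        if pick < len(edges):
            # Edge chosen: insert a new vertex in its "middle" and hang the leaf there.
            below = edges[pick]
            above = parent[below]
            middle = next_id
            next_id += 1
            children[above][children[above].index(below)] = middle
            parent[middle] = above
            children[middle] = [below, leaf]
            parent[below] = middle
            parent[leaf] = middle
        else:
            # Vertex chosen: attach the leaf there, adding one more subtree.
            vertex = branch_points[pick - len(edges)]
            children[vertex].append(leaf)
            parent[leaf] = vertex
    return parent, children


def total_weight(children, alpha):
    """Sum of all edge and branch-point weights of the current shape."""
    n_edges = sum(len(c) for c in children.values())
    vertex_part = sum(len(c) / alpha - 1.0
                      for v, c in children.items() if v != 0 and c)
    return n_edges * (1.0 - 1.0 / alpha) + vertex_part


parent, children = marchal_growth(10, 1.5, seed=1)
print(sum(1 for c in children.values() if not c), total_weight(children, 1.5))
```

For $\alpha = 2$, every branch point with two subtrees has weight 0, so the sketch only ever splits edges and produces binary shapes.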

Acknowledgments. Thanks are due to Daniel Ford for discussing his al- 
pha model with us at an early stage of his work. We would also like to 
thank two referees for valuable comments that led to an improvement of the 
presentation. 